PREFACE


ANALYTIC COMBINATORICS aims at predicting precisely the properties of large 
structured combinatorial configurations, through an approach based extensively on 
analytic methods. Generating functions are the central objects of the theory. 

Analytic combinatorics starts from an exact enumerative description of combina- 
torial structures by means of generating functions, which make their first appearance 
as purely formal algebraic objects. Next, generating functions are interpreted as an- 
alytic objects, that is, as mappings of the complex plane into itself. Singularities 
determine a function’s coefficients in asymptotic form and lead to precise estimates 
for counting sequences. This chain applies to a large number of problems of discrete 
mathematics relative to words, trees, permutations, graphs, and so on. A suitable adap- 
tation of the methods also opens the way to the quantitative analysis of characteristic 
parameters of large random structures, via a perturbational approach. 

Analytic combinatorics can accordingly be organized based on three components: 


Symbolic Methods develops systematic relations between some of the major 
constructions of discrete mathematics and operations on generating func- 
tions which exactly encode counting sequences. 

Complex Asymptotics elaborates a collection of methods by which one can 
extract asymptotic counting information from generating functions, once 
these are viewed as analytic transformations of the complex domain. Singularities
then appear to be a key determinant of asymptotic behaviour.

Random Structures concerns itself with probabilistic properties of large random
structures. Which properties hold with high probability? Which laws
govern randomness in large objects? In the context of analytic combina- 
torics, these questions are treated by a deformation (adding auxiliary vari- 
ables) and a perturbation (examining the effect of small variations of such 
auxiliary variables) of the standard enumerative theory. 


THE APPROACH to quantitative problems of discrete mathematics provided by 
analytic combinatorics can be viewed as an operational calculus for combinatorics. 
The present book exposes this view by means of a very large number of examples 
concerning classical combinatorial structures, most notably words, trees, compositions,
partitions, permutations, mappings, allocations, planar maps, and graphs. The
eventual goal is an effective way of quantifying metric properties of large random 
structures. 

Given its capacity for quantifying properties of large discrete structures, Analytic
Combinatorics lends itself to many applications, within combinatorics itself, but,
perhaps more importantly, within other areas of science where discrete probabilistic
models recurrently surface, like statistical physics, computational biology, electrical
engineering, and information theory. Last but not least, the analysis of algorithms
and data structures in computer science has served and still serves as an important 
motivation in the development of the theory. 

Part A: Symbolic Methods. This part specifically exposes Symbolic Methods,
which is a unified algebraic theory dedicated to setting up functional relations be- 
tween counting generating functions. As it turns out, a collection of general (and 
simple) theorems provide a systematic translation mechanism between combinato- 
rial constructions and operations on generating functions. This translation process is 
a purely formal one. Precisely, as regards basic counting, two parallel frameworks 
coexist—one for unlabelled structures and ordinary generating functions, the other 
for labelled structures and exponential generating functions. Furthermore, within the 
theory, parameters of combinatorial configurations can be easily taken into account 
by adding supplementary variables. Three chapters then compose this part: Chapter I 
deals with unlabelled objects; Chapter II develops in a parallel way labelled objects; 
Chapter III treats multivariate aspects of the theory suitable for the analysis of param- 
eters of combinatorial structures. 

Part B: Complex Asymptotics. This part specifically exposes Complex Asymptotics,
which is a unified analytic theory dedicated to the process of extracting asymptotic
information from counting generating functions. A collection of general
(and simple) theorems provide a systematic translation mechanism between generating
functions and asymptotic forms of coefficients. Five chapters compose this
part. Chapter IV serves as an introduction to complex-analytic methods and proceeds 
with the treatment of meromorphic functions, that is, functions whose singularities are 
poles, rational functions being the simplest case. Chapter V develops applications of 
rational and meromorphic asymptotics of generating functions, with numerous appli- 
cations related to words and languages, walks and graphs, as well as permutations. 
Chapter VI develops a general theory of singularity analysis that applies to a wide 
variety of singularity types, such as square-root or logarithmic, and has applications 
to trees as well as to other recursively defined combinatorial classes. Chapter VII 
presents applications of singularity analysis to 2-regular graphs and polynomials, 
trees of various sorts, mappings, context-free languages, walks, and maps. It contains 
in particular a discussion of the analysis of coefficients of algebraic functions. Chapter VIII
explores saddle point methods, which are instrumental in analysing functions
with a violent growth at a singularity, as well as many functions with only a singularity 
at infinity (i.e., entire functions). 

Part C: Random Structures. This part includes Chapter IX dedicated to the
analysis of multivariate generating functions viewed as deformation and perturbation 
of simple (univariate) functions. As a consequence, many important characteristics 
of classical combinatorial structures can be precisely quantified in distribution. Chap- 
ter ?? is an epilogue, which offers a brief recapitulation of the major asymptotic prop- 
erties of discrete structures developed in earlier chapters. 


Part D: Appendices. Appendix A summarizes some key elementary concepts of
combinatorics and asymptotics, with entries relative to asymptotic expansions, lan- 
guages, and trees, amongst others. Appendix B recapitulates the necessary back- 
ground in complex analysis. It may be viewed as a self-contained minicourse on 
the subject, with entries relative to analytic functions, the Gamma function, the im- 
plicit function theorem, and Mellin transforms. Appendix C recalls some of the basic 
notions of probability theory that are useful in analytic combinatorics. 

THIS BOOK is meant to be reader-friendly. Each major method is abundantly illustrated
by means of concrete examples¹ treated in detail (there are scores of them,
spanning from a fraction of a page to several pages), each offering a complete treatment of a
specific problem. These are borrowed not only from combinatorics itself but also from 
neighbouring areas of science. With a view to addressing not only mathematicians of
varied profiles but also scientists of other disciplines, Analytic Combinatorics is
self-contained, including ample appendices that recapitulate the necessary background in
combinatorics and complex function theory. A rich set of short Notes (there are more
than 250 of them) is inserted in the text² and can provide exercises meant for self-study
or for students’ practice, as well as introductions to the vast body of literature
that is available. We have also made every effort to focus on core ideas rather than 
technical details, supposing a certain amount of mathematical maturity but only basic 
prerequisites on the part of our gentle readers. The book is also meant to be strongly 
problem-oriented, and indeed it can be regarded as a manual, or even a huge algorithm, 
guiding the reader to the solution of a very large variety of problems regarding dis- 
crete mathematical models of varied origins. In this spirit, many of our developments 
connect nicely with computer algebra and symbolic manipulation systems. 


COURSES can be (and indeed have been) based on the book in various ways. 
Chapters I-III on Symbolic Methods serve as a systematic yet accessible introduction 
to the formal side of combinatorial enumeration. As such it organizes transparently 
some of the rich material found in treatises³ like those of Bergeron-Labelle-Leroux,
Comtet, Goulden-Jackson, and Stanley. Chapters IV-VIII relative to Complex Asymptotics
provide a large set of concrete examples illustrating the power of classical complex
analysis and of asymptotic analysis outside of their traditional range of applications.
This material can thus be used in courses of either pure or applied mathematics,
providing a wealth of nonclassical examples. In addition, the quiet but ubiquitous 
presence of symbolic manipulation systems provides a number of illustrations of the 
power of these systems while making it possible to test and concretely experiment 
with a great many combinatorial models. Symbolic systems allow for instance for 
fast random generation, close examination of non-asymptotic regimes, efficient ex- 
perimentation with analytic expansions and singularities, and so on. 


¹ Examples are marked by “EXAMPLE · · ·”.
² Notes are indicated by ▷ · · · ◁.
³ References are to be found in the bibliography section at the end of the book.


Our initial motivation when starting this project was to build a coherent set of 
methods useful in the analysis of algorithms, a domain of computer science now well- 
developed and presented in books by Knuth, Hofri, Mahmoud, and Szpankowski, in 
the survey by Vitter—Flajolet, as well as in our earlier Introduction to the Analysis of 
Algorithms published in 1996. This book can then be used as a systematic presenta- 
tion of methods that have proved immensely useful in this area; see in particular the 
Art of Computer Programming by Knuth for background. Studies in statistical physics 
(van Rensburg, and others), statistics (e.g., David and Barton) and probability theory 
(e.g., Billingsley, Feller), mathematical logic (Burris’ book), analytic number theory 
(e.g., Tenenbaum), computational biology (Waterman’s textbook), as well as informa- 
tion theory (e.g., the books by Cover-Thomas, MacKay, and Szpankowski) point to 
many startling connections with yet other areas of science. The book may thus be 
useful as a supplementary reference on methods and applications in courses on statis- 
tics, probability theory, statistical physics, finite model theory, analytic number theory, 
information theory, computer algebra, complex analysis, or analysis of algorithms. 

An invitation to Analytic Combinatorics

-- PLATO, The Timaeus¹


ANALYTIC COMBINATORICS is primarily a book about Combinatorics, that is, 
the study of finite structures built according to a finite set of rules. Analytic in the title 
means that we concern ourselves with methods from mathematical analysis, in partic- 
ular complex and asymptotic analysis. The two fields, combinatorial enumeration and 
complex asymptotics, are organized into a coherent set of methods for the first time 
in this book. Our broad objective is to discover how the continuous may help us to 
understand the discrete and to quantify its properties. 


COMBINATORICS is as told by its name the science of combinations. Given ba- 
sic rules for assembling simple components, what are the properties of the resulting 
objects? Here, our goal is to develop methods dedicated to quantitative properties of 
combinatorial structures. In other words, we want to measure things. Say that we 
have n different items like cards or balls of different colours. In how many ways 
can we lay them on a table, all in one row? You certainly recognize this counting 
problem—finding the number of permutations of n elements. The answer is of course 
the factorial number, $n! = 1 \cdot 2 \cdots n$. This is a good start, and, equipped with patience
or a calculator, we soon determine that if n = 31, say, then the number is the rather
large²

\[ 31! = 8222838654177922817725562880000000 \doteq 0.8222838654 \cdot 10^{34}. \]


The factorials solve an enumerative problem, one that took mankind some time to sort
out, because the sense of the ‘$\cdots$’ in the formula is not that easily grasped. In his book
The Art of Computer Programming (vol. III, p. 23), Donald Knuth traces the discovery
to the Hebrew Book of Creation (c. A.D. 400) and the Indian classic Anuyogadvara-sutra
(c. A.D. 500).

¹ “So their combinations with themselves and with each other give rise to endless complexities, which
anyone who is to give a likely account of reality must survey.” Plato speaks of Platonic solids viewed as
idealized primary constituents of the physical universe.

² We use ‘$a \doteq d$’ to represent a numerical estimation of the real $a$ by the decimal $d$, with the last digit
being at most ±1 from its actual value.

Here is another more subtle problem. Assume that you are interested in permuta- 
tions such that the first element is smaller than the second, the second is larger than the 
third, itself smaller than the fourth, and so on. The permutations go up and down and 
they are diversely known as up-and-down or zigzag permutations, the more dignified 
name being alternating permutations. Say that n = 2m + 1 is odd. An example is for
n = 9:

      8   7   9   3
    4   6   5   1   2

The number of alternating permutations for n = 1,3,5,... turns out to be 
1, 2, 16, 272, 7936, 353792, 22368256,.... 
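
The first few of these values can be checked by brute force. The following is a minimal Python sketch, not taken from the text: the helper names `is_alternating` and `count_alternating` are illustrative assumptions. It enumerates all permutations of size n and keeps those that go up, then down, and so on, which is only feasible for small n.

```python
from itertools import permutations


def is_alternating(perm):
    """True when perm[0] < perm[1] > perm[2] < perm[3] > ... (up first, then down)."""
    return all(
        (a < b) if i % 2 == 0 else (a > b)
        for i, (a, b) in enumerate(zip(perm, perm[1:]))
    )


def count_alternating(n):
    """Count alternating permutations of 1..n by exhaustive enumeration (n! candidates)."""
    return sum(is_alternating(p) for p in permutations(range(1, n + 1)))


# Sizes 1, 3, 5, 7 reproduce the first four values listed in the text.
print([count_alternating(n) for n in (1, 3, 5, 7)])
```

Exhaustive enumeration costs n! steps, so it serves only as a sanity check; the generating function approach developed next scales far better.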


What are these numbers and how do they relate to the total number of permutations of 
corresponding size? A glance at the corresponding figures, that is, 1!,3!,5!,... or 


1,6, 120, 5040, 362880, 39916800, 6227020800, ... 


suggests that the factorials grow somewhat faster—just compare the lengths of the last 
two displayed lines. But how and by how much? This is the prototypical question we 
are addressing in this book. 

Let us now examine the counting of alternating permutations. In 1881, the French 
mathematician Désiré André made a startling discovery. Look at the first terms of the 
Taylor expansion of the trigonometric function tan(z): 
\[ \tan z = 1\,\frac{z}{1!} + 2\,\frac{z^3}{3!} + 16\,\frac{z^5}{5!} + 272\,\frac{z^7}{7!} + 7936\,\frac{z^9}{9!} + 353792\,\frac{z^{11}}{11!} + \cdots \]
The counting sequence for alternating permutations curiously surfaces. We say that 
the function on the left is a generating function for the numerical sequence (precisely, 
a generating function of the exponential type due to the presence of factorials in the 
denominators). 

André’s derivation may nowadays be viewed very simply as reflecting the construction
of permutations by means of certain binary trees: given a permutation $\sigma$, a
tree can be obtained once $\sigma$ has been decomposed as a triple $(\sigma_L, \max, \sigma_R)$, by taking
the maximum element as the root, and appending, as left and right subtrees, the
trees recursively constructed from $\sigma_L$ and $\sigma_R$. Part A of this book develops at length
symbolic methods by which the construction of the class T of all such trees, 


\[ \mathcal{T} = 1 \,\cup\, (\mathcal{T}, \max, \mathcal{T}) \]
translates into an equation relating generating functions,
\[ T(z) = z + \int_0^z T(w)^2 \, dw. \]


In this equation, $T(z) := \sum_n T_n z^n/n!$ is the exponential generating function of the
sequence $(T_n)$, where $T_n$ is the number of alternating permutations of (odd) length n.
There is a compelling formal analogy between the combinatorial specification and the 
world of generating functions: unions ($\cup$) give rise to sums ($+$), max-placement gives
an integral ($\int$), forming a pair of trees means taking a square ($[\cdot]^2$).
At this stage, we know that T(z) must solve the differential equation 


\[ \frac{d}{dz} T(z) = 1 + T(z)^2, \qquad T(0) = 0, \]
which, by classical manipulations, yields T(z) = tan z. The generating function then 


provides a simple algorithm to compute recurrently the coefficients, since the formula, 


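\[ \tan z = \frac{\sin z}{\cos z} = \frac{\dfrac{z}{1!} - \dfrac{z^3}{3!} + \dfrac{z^5}{5!} - \cdots}{1 - \dfrac{z^2}{2!} + \dfrac{z^4}{4!} - \cdots}, \]

implies (n odd)
\[ T_n - \binom{n}{2} T_{n-2} + \binom{n}{4} T_{n-4} - \cdots = (-1)^{(n-1)/2}, \qquad \text{where} \quad \binom{a}{b} = \frac{a!}{b!\,(a-b)!} \]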


is the conventional notation for binomial coefficients. At this stage, the exact enumer- 
ative problem may be regarded as solved since a very simple recurrent algorithm is 
available for determining the counting sequence, while the generating function admits 
an explicit expression in terms of a well known function. 
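
The recurrent algorithm mentioned here can be sketched in a few lines of Python. This is an illustration only, assuming the binomial relation obtained from tan z · cos z = sin z is solved for T_n at each odd n; the function name `tangent_numbers` is hypothetical.

```python
from math import comb


def tangent_numbers(max_n):
    """Return {n: T_n} for odd n <= max_n, using
    T_n - C(n,2) T_{n-2} + C(n,4) T_{n-4} - ... = (-1)^((n-1)/2)."""
    T = {}
    for n in range(1, max_n + 1, 2):
        rhs = (-1) ** ((n - 1) // 2)
        # Move every term except T_n to the right-hand side.
        for j in range(1, (n - 1) // 2 + 1):
            rhs -= (-1) ** j * comb(n, 2 * j) * T[n - 2 * j]
        T[n] = rhs
    return T


# Values for n = 1, 3, ..., 13; these agree with the sequence 1, 2, 16, 272, ... quoted earlier.
print(list(tangent_numbers(13).values()))
```

Each T_n costs a number of arithmetic operations linear in n once the earlier values are known, in contrast with the n! candidates of exhaustive enumeration.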


ANALYSIS, by which we mean mathematical analysis, is often described as the 
art and science of approximation. How fast do the factorial and the tangent number 
sequences grow? What about comparing their growths? These are typical problems 
of analysis. 

First, consider the number of permutations, n!. Quantifying the growth of these 
numbers as n gets large takes us to the realm of asymptotic analysis. The way to 
express factorial numbers in terms of elementary functions is known as Stirling’s formula,

\[ n! \sim n^n e^{-n} \sqrt{2\pi n}, \]

where the ~ sign means “approximately equal” (in fact, in the precise sense that the
ratio of both terms tends to 1 as n gets large). This beautiful formula, associated with
the name of the eighteenth century Scottish mathematician James Stirling, curiously
involves both the base e of natural logarithms and the perimeter 2π of the circle.


Certainly, you cannot get such a thing without analysis. As a first step, there is an 


estimate for
\[ \log n! = \sum_{j=1}^{n} \log j \sim \int_1^n \log x \, dx \sim n \log\left(\frac{n}{e}\right), \]


explaining at least the $n^n e^{-n}$ term, but already requiring some amount of elementary
calculus. (Stirling’s formula precisely came a few decades after the fundamental bases
of calculus had been laid by Newton and Leibniz.) Note the usefulness of Stirling’s
formula: it tells us almost instantly that 100! has 158 digits, while 1000! borders the
astronomical $10^{2568}$.
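
The digit counts quoted above can be reproduced with a short Python sketch. It is an illustration under the assumption that Stirling’s formula is evaluated through base-10 logarithms (to avoid floating-point overflow); the function names are hypothetical.

```python
import math


def stirling_log10(n):
    """log10 of Stirling's approximation n^n e^(-n) sqrt(2*pi*n)."""
    return n * math.log10(n / math.e) + 0.5 * math.log10(2 * math.pi * n)


def digits_by_stirling(n):
    """Number of decimal digits of n! as predicted by Stirling's formula."""
    return math.floor(stirling_log10(n)) + 1


def digits_exact(n):
    """Number of decimal digits of n!, from the exact integer."""
    return len(str(math.factorial(n)))


for n in (31, 100, 1000):
    print(n, digits_by_stirling(n), digits_exact(n))
```

Working with logarithms keeps every intermediate quantity small, so the estimate for 1000! costs no more than the one for 31!.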

We are now left with estimating the growth of the sequence of tangent numbers,
$T_n$. The analysis leading to the derivation of the generating function tan(z) has been
so far essentially algebraic or “formal”. Well, we can plot the graph of the tangent
function, for real values of its argument and see that the function becomes infinite
at the points $\pm\frac{\pi}{2}$, $\pm\frac{3\pi}{2}$, and so on (Figure 1). Such points where a function ceases
to be smooth are called singularities. By methods amply developed in this book, it
is the local nature of a generating function at its “dominant” singularities (i.e., the
ones closest to the origin) that determines the asymptotic growth of the sequence of
coefficients. In this perspective, the basic fact that tan z has dominant singularities at
$\pm\frac{\pi}{2}$ enables us to reason as follows: first approximate the generating function tan z
near its two dominant singularities, namely,

\[ \tan z \underset{z \to \pm\pi/2}{\sim} \frac{8z}{\pi^2 - 4z^2}; \]

then extract coefficients of this approximation; finally, get in this way a valid approximation
of coefficients:

\[ \frac{T_n}{n!} \sim 2 \cdot \left(\frac{2}{\pi}\right)^{n+1} \qquad (n \text{ odd}). \]

FIGURE 1. Two views of the function $z \mapsto \tan z$: (left) a plot for real values
of $z \in [-5\,..\,5]$; (right) the modulus $|\tan z|$ when z is assigned complex values in
the square $\pm 2.25 \pm 2.25\sqrt{-1}$.


With present day technology, we also have available symbolic manipulation systems
(also called “computer algebra” systems) and it is not difficult to verify the accuracy
of our estimates. Here is a small pyramid for n = 3, 5, ..., 21,

                   2 | [1]
                  16 | 1[5]
                 272 | 27[1]
                7936 | 793[5]
              353792 | 35379[1]
            22368256 | 2236825[1]
          1903757312 | 1903757[267]
        209865342976 | 209865342[434]
      29088885112832 | 290888851[04489]
    4951498053124096 | 495149805[2966307]

comparing the exact values of $T_n$ against the approximations $T_n^*$, where (n odd)

\[ T_n^* := \left\lfloor 2 \cdot n! \left(\frac{2}{\pi}\right)^{n+1} \right\rfloor, \]

and discrepant digits of the approximation are enclosed in brackets. For n = 21, the error
is only of the order of one in a billion. Asymptotic analysis is in this case wonderfully
accurate.

FIGURE 2. The collection of all binary trees for sizes n = 2, 3, 4, 5 with respective
cardinalities 2, 5, 14, 42.
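
The comparison can be reproduced without a computer algebra system. The Python sketch below is an illustration only: it assumes the exact values are obtained from the binomial recurrence derived from tan z = sin z / cos z, and it evaluates the estimate with the standard `decimal` module using a hard-coded 50-digit value of π; the names `tangent_numbers`, `tangent_estimate`, and `PI` are hypothetical.

```python
from decimal import Decimal, getcontext
from math import comb, factorial

getcontext().prec = 60  # working precision, in significant digits
# pi to 50 decimal places, hard-coded because the decimal module has no pi constant
PI = Decimal("3.14159265358979323846264338327950288419716939937510")


def tangent_numbers(max_n):
    """Exact T_n for odd n <= max_n, from the binomial recurrence."""
    T = {}
    for n in range(1, max_n + 1, 2):
        rhs = (-1) ** ((n - 1) // 2)
        for j in range(1, (n - 1) // 2 + 1):
            rhs -= (-1) ** j * comb(n, 2 * j) * T[n - 2 * j]
        T[n] = rhs
    return T


def tangent_estimate(n):
    """floor(2 * n! * (2/pi)^(n+1)), evaluated in high-precision decimal arithmetic."""
    value = 2 * factorial(n) * (Decimal(2) / PI) ** (n + 1)
    return int(value)  # int() truncates, which equals the floor for positive values


exact = tangent_numbers(21)
for n in range(3, 22, 2):
    print(f"{exact[n]:>18} | {tangent_estimate(n)}")
```

High-precision decimals are used instead of floats because, for n = 21, the quantity to be floored is close to the limit of what double precision represents exactly.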

In the foregoing discussion, we have played down a fact, and an important one. 
When investigating generating functions from an analytic standpoint, one should gen- 
erally assign complex values to arguments not just real ones. It is singularities in 
the complex plane that matter and complex analysis is needed in drawing conclu- 
sions regarding the asymptotic form of coefficients of a generating function. Thus, 
a large portion of this book relies on a complex analysis technology, which starts to 
be developed in Part B of the book titled Complex Asymptotics. This approach to 
combinatorial enumeration parallels what happened in the nineteenth century, when 
Riemann first recognized the deep relation between complex-analytic properties of the 
zeta function, $\zeta(s) := \sum 1/n^s$, and the distribution of primes, eventually leading to
the long-sought proof of the Prime Number Theorem by Hadamard and de la Vallée- 
Poussin in 1896. Fortunately, relatively elementary complex analysis suffices for our 
purposes, and we can include in this book a complete treatment of the fragment of the 
theory needed to develop the bases of analytic combinatorics. 


Here is yet another example illustrating the close interplay between combina- 
torics and analysis. When discussing alternating permutations, we have enumerated 
binary trees bearing distinct integer labels that satisfy a constraint—to increase along 
branches. What about the simpler problem of determining the number of possible 
shapes of binary trees? Let $C_n$ be the number of binary trees that have n binary
branching nodes, hence n + 1 “external nodes”. It is not hard to come up with an
exhaustive listing for small values of n; see Figure 2, from which we determine that

\[ C_0 = 1, \quad C_1 = 1, \quad C_2 = 2, \quad C_3 = 5, \quad C_4 = 14, \quad C_5 = 42. \]


These numbers are probably the most famous ones of elementary combinatorics. They 
have come to be known as the Catalan numbers as a tribute to the Franco-Belgian
mathematician Eugène Charles Catalan (1814-1894), but they already appear in works
of Euler and Segner in the second half of the eighteenth century. In his reference
treatise on Enumerative Combinatorics, Stanley lists over twenty pages a collection of
some 66 different types of combinatorial structures that are enumerated by the Catalan 
numbers. 

First, one can write a combinatorial equation, very much in the style of what has 
been done earlier, but without labels: 


\[ \mathcal{C} = \square \,\cup\, (\mathcal{C}, \bullet, \mathcal{C}). \]


With symbolic methods, it is easy to see that the ordinary generating function of the 
Catalan numbers defined as 
\[ C(z) := \sum_{n \ge 0} C_n z^n, \]
satisfies an equation that is a direct reflection of the combinatorial definition, namely,
\[ C(z) = 1 + z\,C(z)^2. \]


This is a quadratic equation whose solution is 


\[ C(z) = \frac{1 - \sqrt{1 - 4z}}{2z}. \]

Then, by means of Newton’s theorem relative to the expansion of $(1 + x)^\alpha$, one finds
easily ($x = -4z$, $\alpha = \frac{1}{2}$) the closed form expression

\[ C_n = \frac{1}{n+1} \binom{2n}{n}. \]

Regarding asymptotic approximation, Stirling’s formula comes to the rescue: it
implies

\[ C_n \sim C_n^* \quad \text{where} \quad C_n^* := \frac{4^n}{\sqrt{\pi n^3}}. \]

This approximation is quite usable: it predicts $C_1^* \doteq 2.25$ (whereas $C_1 = 1$), which
is off by a factor of 2, but the error drops to about 10% already for n = 10, and it is
down to about 1% for n = 100.
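
The chain from functional equation to closed form to asymptotic estimate can be checked numerically. The following Python sketch is an illustration only: it assumes the coefficients are extracted from C(z) = 1 + z C(z)^2 by the convolution recurrence that this equation implies, and the function names are hypothetical.

```python
from math import comb, pi, sqrt


def catalan_by_recurrence(max_n):
    """Coefficients of C(z) = 1 + z*C(z)^2: C_0 = 1, C_{n+1} = sum_k C_k * C_{n-k}."""
    C = [1]
    for n in range(max_n):
        C.append(sum(C[k] * C[n - k] for k in range(n + 1)))
    return C


def catalan_closed_form(n):
    """C_n = binomial(2n, n) / (n + 1), computed exactly with integers."""
    return comb(2 * n, n) // (n + 1)


def catalan_estimate(n):
    """Asymptotic estimate 4^n / sqrt(pi * n^3)."""
    return 4 ** n / sqrt(pi * n ** 3)


print(catalan_by_recurrence(5))
for n in (1, 10, 100):
    exact = catalan_closed_form(n)
    relative_error = (catalan_estimate(n) - exact) / exact
    print(n, f"{relative_error:.4f}")
```

The recurrence and the closed form are independent routes to the same integers, so agreement between them is a useful check on both.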

A plot of the generating function $C(z)$ in Figure 3 illustrates the fact that $C(z)$ has
a singularity at $z = \frac{1}{4}$ as it ceases to be differentiable (its derivative becomes infinite).
That singularity is quite different from a pole and for natural reasons it is known as
a square-root singularity. As we shall see repeatedly, under suitable conditions in
the complex plane, a square root singularity for a function at a point $\rho$ invariably
entails an asymptotic form $\rho^{-n} n^{-3/2}$ for its coefficients. More generally, it suffices
to estimate a generating function near a singularity in order to deduce an asymptotic 
approximation of its coefficients. This correspondence is a major theme of the book, 
one that motivates the four central chapters. 

A consequence of the complex-analytic vision of combinatorics is the detection of 
universality phenomena in large random structures. (The term is originally borrowed 
from statistical physics and is nowadays finding increasing use in areas of mathe- 
matics like probability theory.) By universality is meant here that many quantitative 
properties of combinatorial structures only depend on a few global features of their 
definitions, not on details. For instance a growth in the counting sequence of the form

\[ C \cdot A^n \, n^{-3/2}, \]

arising from a square-root singularity, will be shown to be universal across all varieties
of trees determined by a finite set of allowed node degrees; this includes unary-binary
trees, ternary trees, 0-11-13 trees, as well as many variations like nonplane
trees and labelled trees. Even though generating functions may become arbitrarily
complicated (like an algebraic function of a very high degree or even the solution to
an infinite functional equation), it is still possible to extract with relative ease global
asymptotic laws governing counting sequences.

FIGURE 3. Left: the real values of the Catalan generating function, which has a
square-root singularity at $z = \frac{1}{4}$. Right: the ratio $C_n/(4^n n^{-3/2})$ plotted together
with its asymptote at $1/\sqrt{\pi} \doteq 0.56418$.


RANDOMNESS is another ingredient in our story. How useful is it to determine, 
exactly or approximately, counts that may be so large as to require hundreds if not 
thousands of digits in order to be written down? Take again the example of alternating 
permutations. When estimating their number, we have indeed quantified the propor- 
tion of these amongst all permutations. In other words, we have been predicting the 
probability that a random permutation of some size n is alternating. Results of this sort 
are of interest in all branches of science. For instance, biologists routinely deal with 
genomic sequences of length $10^5$, and the interpretation of data requires developing
enumerative or probabilistic models where the number of possibilities is of the order
of $4^{10^5}$. The language of probability theory then proves a great convenience when
discussing characteristic parameters of discrete structures, as we can interpret exact 
or asymptotic enumeration results as saying something concrete about the likeliness 
of values that such parameters assume. Equally important of course are results from 
several areas of probability theory: as demonstrated in the later sections of this book, 
such results merge extremely well with the analytic-combinatorial framework. 

Say we are now interested in runs in permutations. These are the longest frag- 
ments of a permutation that already appear in (increasing) sorted order. Here is a 
permutation where runs have been separated by vertical bars:

    2 5 8 | 3 9 | 1 4 7 | 6.


Runs naturally present in a permutation are for instance exploited by a sorting algorithm
called “natural list mergesort”, which builds longer and longer runs, starting
from the original ones and merging them until the permutation is eventually sorted. 
For our understanding of this algorithm, it is then of obvious interest to quantify how 
many runs a permutation is likely to have. 

Let $A_{n,k}$ be the number of permutations of size n having k runs. Then, the problem
is once more best approached by generating functions and one finds that the coefficient
of $u^k z^n$ inside the bivariate generating function,

\[ F(z,u) \equiv \frac{1-u}{1 - u\,e^{z(1-u)}} = 1 + zu + \frac{z^2}{2!}\,u(u+1) + \frac{z^3}{3!}\,u(u^2+4u+1) + \cdots, \]

gives the sought numbers $A_{n,k}/n!$. (A simple way of establishing this formula bases
itself on the tree decomposition of permutations and on the symbolic method.) From
there, we can easily determine effectively the mean, variance, and even the higher 
moments of the number of runs that a random permutation has: it suffices to expand 
blindly, or even better with the help of a computer, the bivariate generating function 
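above as $u \to 1$:

\[ \frac{1}{1-z} + \frac{1}{2}\,\frac{z(2-z)}{(1-z)^2}\,(u-1) + \frac{1}{12}\,\frac{z^2(6-4z+z^2)}{(1-z)^3}\,(u-1)^2 + \cdots. \]

When $u = 1$, we just enumerate all permutations: this is the constant term 1/(1 - z)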
equal to the exponential generating function of all permutations. The coefficient of 
u — 1 gives the generating function of the mean number of runs, the next one gives 
access to the second moment, and so on. In this way, we discover that the expectation 
and standard deviation of the number of runs in a permutation of size n evaluate to 


\[ \mu_n = \frac{n+1}{2}, \qquad \sigma_n = \sqrt{\frac{n+1}{12}}. \]
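
For small sizes, these two formulae can be confirmed by exhaustive enumeration. The Python sketch below is an illustration only, with hypothetical helper names; it counts a run as a maximal increasing fragment, so the number of runs is one more than the number of descents, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations


def count_runs(perm):
    """Number of maximal increasing fragments: one plus the number of descents."""
    return 1 + sum(1 for a, b in zip(perm, perm[1:]) if a > b)


def run_statistics(n):
    """Exact mean and variance of the number of runs over all n! permutations."""
    counts = [count_runs(p) for p in permutations(range(1, n + 1))]
    total = len(counts)
    mean = Fraction(sum(counts), total)
    variance = Fraction(sum(c * c for c in counts), total) - mean ** 2
    return mean, variance


for n in range(2, 8):
    mean, variance = run_statistics(n)
    # Compare against (n + 1)/2 and (n + 1)/12, the square of the standard deviation.
    print(n, mean, variance, Fraction(n + 1, 2), Fraction(n + 1, 12))
```

The loop starts at n = 2 because a permutation of size 1 always has exactly one run, so its variance is zero and the variance formula does not apply there.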


Then by easy analytic-probabilistic inequalities (Chebyshev inequalities) that other- 
wise form the basis of what is known as the second moment method, we learn that the 
distribution of the number of runs is concentrated around its mean: in all likelihood, 
if one takes a random permutation, the number of its runs is going to be very close to 
its mean. The effects of such quantitative laws are quite tangible. It suffices to draw a 
sample of one element for n = 30 to get something like
13, 22, 29 | 12, 15, 23 | 8, 28 | 18 | 6, 26 | 4, 10, 16 | 1, 27 | 3, 14, 17, 20 | 2, 21, 30 | 25 | 11, 19 | 9 | 7, 24.
For n = 30, the mean is 15.5, and this sample comes rather close as it has 13 runs.
We shall furthermore see in Chapter IX that even for moderately large permutations
of size 10,000 and beyond, the probability for the number of observed runs to deviate
by more than 10% from the mean is less than $10^{-65}$. As witnessed by this example,
much regularity accompanies properties of large combinatorial structures. 
More refined methods combine the observation of singularities with analytic results
from probability theory (e.g., continuity theorems for characteristic functions).
In the case of runs in permutations, the quantity $F(z,u)$ viewed as a function of z
when u is fixed appears to have a pole: this fact is apparent on Figure 4 [left] since
$1/F$ has a zero at some $z = \rho(u)$ where $\rho(1) = 1$. Then we are confronted with
a fairly regular deformation of the generating function of all permutations. A parameterized
version (with parameter u) of singularity analysis then gives access to a
description of the asymptotic behaviour of the Eulerian numbers $A_{n,k}$. This enables
us to describe very precisely what goes on: in a random permutation of large size n,
once centred by its mean and scaled by its standard deviation, the distribution of the
number of runs is asymptotically Gaussian; see Figure 4 [right].

FIGURE 4. Left: A partial plot of the real values of the inverse $1/F(z,u)$
for u = 0.1 .. 2, with F the bivariate generating function of Eulerian numbers,
illustrates the presence of a movable pole for F. Right: A diagram showing the
distribution of the number of runs in permutations for n = 6 .. 60.

A somewhat similar type of situation prevails for binary trees, despite the fact
that the counting sequences and the counting generating functions look rather different
from their permutation counterparts. Say we are interested in leaves (also sometimes
known as “cherries”) in trees: these are binary nodes that are attached to two
external nodes ($\square$). Let $C_{n,k}$ be the number of trees of size n having k leaves. The
bivariate generating function $C(z,u) := \sum_{n,k} C_{n,k} z^n u^k$ encodes all the information
relative to leaf statistics in random binary trees. A modification of previously seen
symbolic arguments shows that $C(z,u)$ still satisfies a quadratic equation resulting in
the explicit form,

\[ C(z,u) = \frac{1 - \sqrt{1 - 4z + 4z^2(1-u)}}{2z}. \]


This reduces to $C(z)$ for $u = 1$, as it should, and the bivariate generating function 
$C(z,u)$ is a deformation of $C(z)$ as $u$ varies. In fact, the network of curves 
of Figure 5 for several fixed values of $u$ shows that there is a smoothly varying square-root 
singularity. It is possible to analyse the perturbation induced by varying values 
of $u$, to the effect that $C(z,u)$ is of the global analytic type 

    $\lambda(u) \cdot \sqrt{1 - \dfrac{z}{\rho(u)}},$ 

for some analytic $\lambda(u)$ and $\rho(u)$. The already evoked process of singularity analysis 
then shows that the probability generating function of the number of leaves in a tree 
of size $n$ satisfies an approximation of the form 

    $\dfrac{\lambda(u)}{\lambda(1)} \left( \dfrac{\rho(1)}{\rho(u)} \right)^{n} (1 + o(1)).$ 

This “quasi-powers” approximation thus resembles very much the probability 
generating function of a sum of $n$ independent random variables, a situation that resorts 
to the classical Central Limit Theorem of probability theory. Accordingly, the 
limit distribution of the number of leaves in a large tree is Gaussian. In abstract terms, 
the deformation induced by the secondary parameter (here, the number of leaves, previously, 
the number of runs) is susceptible to a perturbation analysis, to the effect that 
a singularity gets smoothly displaced without changing its nature (here, a square root 
singularity, earlier a pole) and a limit law systematically results. Again some of the 
conclusions can be verified even by very small samples: the single tree of size 300 
drawn at random and displayed in Figure 5 has 69 cherries while the expected value 
of this number is approximately 75.375 and the standard deviation is a little over 4. In a large 
number of cases of which this one is typical, we find metric laws of combinatorial 
structures that govern large structures with high probability and eventually make them 
highly predictable. 

FIGURE 5. Left: The bivariate generating function $C(z,u)$ enumerating binary 
trees by size and number of leaves exhibits consistently a square-root singularity 
as a function of $z$ for several values of $u$. Right: a binary tree of size 300 drawn 
uniformly at random has 69 leaves or “cherries”. 
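The leaf statistics just described can be checked on small sizes. The sketch below is an illustration under stated assumptions rather than the book's own method: it assumes the decomposition $C = 1 + z\,(C^2 - 1 + u)$ (a tree is either empty or a root with two subtrees, the root being a cherry exactly when both subtrees are empty), which is equivalent to the quadratic equation behind the explicit form of $C(z,u)$. The closed form used in `mean_leaves` comes from differentiating that equation with respect to $u$ at $u = 1$, which gives $[z^n]\, z/\sqrt{1-4z} = \binom{2n-2}{n-1}$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def leaf_table(n_max):
    """table[n][k] = number of binary trees with n internal nodes and k cherries."""
    table = [{0: 1}]  # the empty tree: size 0, no cherry
    for n in range(1, n_max + 1):
        row = {}
        for left in range(n):
            right = n - 1 - left
            for k1, c1 in table[left].items():
                for k2, c2 in table[right].items():
                    k = k1 + k2
                    if left == 0 and right == 0:
                        k = 1  # the root is attached to two external nodes
                    row[k] = row.get(k, 0) + c1 * c2
        table.append(row)
    return table


def mean_leaves(n):
    """Exact mean number of cherries in a random binary tree of size n >= 1."""
    catalan = comb(2 * n, n) // (n + 1)
    return Fraction(comb(2 * n - 2, n - 1), catalan)
```

With these definitions, `float(mean_leaves(300))` evaluates to about 75.3756, in agreement with the expected value quoted for the tree of size 300.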

Such randomness properties form the subject of Part C of this book dedicated to 
random structures. As our earlier description implies, there is an extreme degree of 
generality in this analytic approach to combinatorial parameters, and after reading this 
book, the reader will be able to recognize by herself dozens of such cases at sight, and 
effortlessly establish the corresponding theorems. 

FIGURE 6. The logical structure of Analytic Combinatorics. 


A RATHER ABSTRACT VIEW of combinatorics emerges from the previous discussion; 
see Figure 6. A combinatorial class, as regards its enumerative properties, can 
be viewed as a surface in four-dimensional real space: this is the graph of its generating 
function, considered as a function from the set $\mathbb{C} \cong \mathbb{R}^2$ of complex numbers to 
itself, and is otherwise known as a Riemann surface. This surface has “cracks”, that 
is, singularities, which determine the asymptotic behaviour of the counting sequence. 
A combinatorial construction (like forming freely sequences, sets, and so on) can then 
be examined based on the effect it has on singularities. In this way, seemingly differ- 
ent types of combinatorial structures appear to be subject to common laws governing 
not only counting but also finer characteristics of combinatorial structures. For the 
already discussed case of universality in tree enumerations, additional universal laws 
valid across many tree varieties constrain for instance height (which, with high prob- 
ability, is proportional to the square-root of size) and the number of leaves (which is 
invariably normal in the asymptotic limit). 

Next, the probabilistic behaviour of a parameter of a combinatorial class is fully 
determined by a bivariate generating function, which is a deformation of the basic 
counting generating function of the class. (In the sense that setting the secondary 
variable $u$ to 1 erases the information relative to the parameter and leads back to 
the univariate counting generating function.) Then, the asymptotic distribution of a 
parameter of interest is characterized by a collection of surfaces, each having its own 
singularities. The way the singularities’ locations move or their nature changes under 
deformation encodes all the necessary information regarding the distribution of the 
parameter under consideration. Limit laws for combinatorial parameters can then be 
obtained and the corresponding phenomena can be organized into broad categories, 
called schemas. It would not be conceivable to attain such a far-reaching classification 
of metric properties of combinatorial structures by elementary real analysis alone. 


OBJECTS on which we are going to inflict the treatments just described include 
many of the most important ones of discrete mathematics, also the ones that surface 
recurrently in several branches of the applied sciences. We shall thus encounter words 
and sequences, trees and lattice paths, graphs of various sorts, mappings, allocations, 
permutations, integer partitions and compositions, and planar maps, to name a few. 
In most cases, their principal characteristics will be finely quantified by the methods 
of analytic combinatorics; see our concluding Chapter ?? for a summary. This book 
indeed develops a coherent theory of random combinatorial structures based on a pow- 
erful analytic methodology. Literally dozens of quite diverse combinatorial types can 
then be treated by a logically transparent chain. You will not find ready-made answers 
to all questions in this book, but, hopefully, methods that can be successfully used to 
address a great many of them. 


Part A 


SYMBOLIC METHODS 


Combinatorial Structures and 
Ordinary Generating Functions 


Laplace discovered the remarkable correspondence between 

set theoretic operations and operations on formal power series 

and put it to great use to solve a variety of combinatorial problems. 
— GIAN-CARLO ROTA [414] 


Contents 
I.1. Symbolic enumeration methods 
I.2. Admissible constructions and specifications 
I.3. Integer compositions and partitions 
I.4. Words and regular languages 
I.5. Tree structures 
I.6. Additional constructions 
I.7. Perspective 


This chapter and the next are devoted to enumeration, where the problem is to deter- 
mine the number of combinatorial configurations described by finite rules, and do so 
for all possible sizes. For instance, how many different words are there of length 17? 
of length n, for general n? These questions are easy, but what if some constraints 
are imposed, e.g., no four identical elements in a row? The counting sequences are 
exactly encoded by generating functions, and, as we shall see, generating functions 
are the central mathematical object of combinatorial analysis. We examine here a 
framework that, contrary to traditional treatments based on recurrences, explains the 
surprising efficiency of generating functions in the solution of combinatorial enumer- 
ation problems. 

This chapter serves to introduce the symbolic approach to combinatorial enumer- 
ations. The principle is that many general set-theoretic constructions admit a direct 
translation as operations over generating functions. This principle is made concrete 
by means of a dictionary that includes a collection of core constructions, namely the 
operations of union, cartesian product, sequence, set, multiset, and cycle. Supplementary 
operations like pointing and substitution can also be similarly translated. 
In this way, a language describing elementary combinatorial classes is defined. The 
problem of enumerating a class of combinatorial structures then simply reduces to 
finding a proper specification, a sort of program for the class expressed in terms of the 
basic constructions. The translation into generating functions then becomes a purely 
mechanical symbolic process. 

We show here how to describe integer partitions and compositions in such a context, 
as well as several basic string and tree enumeration problems. A parallel approach, 
developed in Chapter II, applies to labelled objects and exponential generating 
functions; in contrast, the plain structures considered in this chapter are called unlabelled. 
The methodology is susceptible to multivariate extensions with which many 
characteristic parameters of combinatorial objects can also be analysed in a unified 
manner: this is to be examined in Chapter III. The symbolic method also has the great 
merit of connecting nicely with complex asymptotic methods that exploit analyticity 
properties and singularities, to the effect that precise asymptotic estimates are usually 
available whenever the symbolic method applies; a systematic treatment of these aspects 
forms the basis of Part B of this book, Complex Asymptotics (Chapters IV–VIII). 


I.1. Symbolic enumeration methods 


First and foremost, combinatorics deals with discrete objects, that is, objects that 
can be finitely described by construction rules. Examples are words, trees, graphs, 
permutations, allocations, functions from a finite set into itself, topological configu- 
rations, and so on. A major question is to enumerate such objects according to some 
characteristic parameter(s). 


Definition I.1. A combinatorial class, or simply a class, is a finite or denumerable set 
on which a size function is defined, satisfying the following conditions: 

(i) the size of an element is a nonnegative integer; 
(ii) the number of elements of any given size is finite. 

If $\mathcal{A}$ is a class, the size of an element $\alpha \in \mathcal{A}$ is denoted by $|\alpha|$, or $|\alpha|_{\mathcal{A}}$ in the 
few cases where the underlying class needs to be made explicit. Given a class $\mathcal{A}$, 
we consistently let $\mathcal{A}_n$ be the set of objects in $\mathcal{A}$ that have size $n$ and use the same 
group of letters for the counts $A_n = \operatorname{card}(\mathcal{A}_n)$ (alternatively, also $a_n = \operatorname{card}(\mathcal{A}_n)$). 
An axiomatic presentation is then as follows: a combinatorial class is a pair $(\mathcal{A}, |\cdot|)$ 
where $\mathcal{A}$ is at most denumerable and the mapping $|\cdot| \in (\mathcal{A} \mapsto \mathbb{N})$ is such that the 
inverse image of any integer is finite. 


Definition I.2. The counting sequence of a combinatorial class is the sequence of 
integers $(A_n)_{n \ge 0}$ where $A_n = \operatorname{card}(\mathcal{A}_n)$ is the number of objects in class $\mathcal{A}$ that 
have size $n$. 


EXAMPLE I.1. Binary words. Consider first the set $\mathcal{W}$ of binary words, which are words over 
the binary alphabet $\mathcal{A} = \{0, 1\}$, 

    $\mathcal{W} := \{\epsilon, 0, 1, 00, 01, 10, 11, 000, 001, 010, \ldots, 1001101, \ldots\},$ 

with $\epsilon$ the empty word. Define size to be the number of letters a word comprises. There are 
two possibilities for each letter and possibilities multiply, so that the counting sequence $(W_n)$ 
satisfies 

    $W_n = 2^n.$ 

(This sequence has a well-known legend associated with the invention of the game of chess: the 
inventor was promised by his king one grain of rice for the first square of the chessboard, two 
for the second, four for the third, and so on. The king naturally could not deliver the promised 
$2^{64} - 1$ grains!) 
END OF EXAMPLE I.1. 

EXAMPLE I.2. Permutations. The set $\mathcal{P}$ of permutations is 

    $\mathcal{P} = \{\ldots, 12, 21, 123, 132, 213, 231, 312, 321, 1234, \ldots, 532614, \ldots\},$ 

since a permutation of $\mathcal{I}_n := [1\,..\,n]$ is a bijective mapping that is representable by an array, 

    $\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix},$ 

or equivalently by the sequence $\sigma_1 \sigma_2 \cdots \sigma_n$ of distinct elements from $\mathcal{I}_n$. Let us define the 
size of a permutation to be its length, $n$. For a permutation written as a sequence of $n$ distinct 
numbers, there are $n$ places where one can accommodate $n$, then $n - 1$ remaining places for 
$n - 1$, and so on. Therefore, the number $P_n$ of permutations of size $n$ satisfies 

    $P_n = n! = 1 \cdot 2 \cdots n.$ 

As indicated in our Invitation chapter, this formula has been known for a long time: Knuth [307, 
p. 23] refers to the Hebrew Book of Creation (c. A.D. 400), and to the Anuyogadvarasutra 
(India, c. A.D. 500) for its discovery. 
END OF EXAMPLE I.2. 


EXAMPLE I.3. Triangulations. The class $\mathcal{T}$ of triangulations comprises triangulations of 
convex polygonal domains which are decompositions into non-overlapping triangles (taken up 
to continuous deformations of the plane). Let us define the size of a triangulation to be the 
number of triangles it is composed of. For the purpose of the present discussion, the reader may 
content herself with what is suggested by Figure 1; the formal specification of triangulations 
appears on p. 33. It is a nontrivial combinatorial result due to Euler and Segner around 1750 
that the number $T_n$ of triangulations is 

(1)    $T_n = \dfrac{1}{n+1} \dbinom{2n}{n} = \dfrac{(2n)!}{(n+1)!\, n!}.$ 

Following Euler [156], the counting of triangulations $(T_n)$ is best approached by generating 
functions: the modified binomial coefficients so obtained are known as Catalan numbers 
(see the discussion p. 33) and are central in combinatorial analysis (Section I.5.3). 
END OF EXAMPLE I.3. 


Although the previous three examples are simple enough, it is generally a good 
idea, when confronted with a combinatorial enumeration problem, to determine the 
initial values of counting sequences, either by hand or better with the help of a com- 
puter, somehow. Here, we find: 


    n      0   1   2   3    4     5     6      7       8        9        10 
    W_n    1   2   4   8   16    32    64    128     256      512      1024 
    P_n    1   1   2   6   24   120   720   5040   40320   362880   3628800 
    T_n    1   1   2   5   14    42   132    429    1430     4862     16796 

(2) 

Such an experimental approach may greatly help identify sequences. For instance, had 
we not known the formula (1) for triangulations, observing an unusual factorization 
like 

    $T_{40} = 2^2 \cdot 5 \cdot 7^2 \cdot 11 \cdot 23 \cdot 43 \cdot 47 \cdot 53 \cdot 59 \cdot 61 \cdot 67 \cdot 71 \cdot 73 \cdot 79,$ 

which contains all prime numbers from 43 to 79, would quickly put us on the tracks 
of the right formula. There even exists nowadays a huge Encyclopedia of Integer 
Sequences due to Sloane that is available in electronic form [439] (see also an earlier 
book by Sloane and Plouffe [440]). Indeed, the three sequences $(W_n)$, $(P_n)$, and $(T_n)$ 
are respectively identified¹ as EIS A000079, EIS A000142, and EIS A000108. 
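The experimental approach lends itself to a few lines of code. The sketch below is only an illustration, with hypothetical function names: it tabulates the initial values of the three counting sequences from their closed forms and factors $T_{40}$ by plain trial division, which is fast here because all prime factors are small.

```python
from math import comb, factorial


def initial_values(n_max):
    """Rows of the table of initial values: W_n = 2^n, P_n = n!, T_n = binom(2n, n)/(n+1)."""
    words = [2 ** n for n in range(n_max + 1)]
    perms = [factorial(n) for n in range(n_max + 1)]
    triangs = [comb(2 * n, n) // (n + 1) for n in range(n_max + 1)]
    return words, perms, triangs


def factorize(m):
    """Prime factorization by trial division, returned as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:  # whatever remains is a single prime factor
        factors[m] = factors.get(m, 0) + 1
    return factors


T40 = comb(80, 40) // 41
T40_FACTORS = factorize(T40)
```

The dictionary `T40_FACTORS` contains every prime from 43 to 79 with exponent 1, which is the pattern that hints at a quotient of factorials.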


▷ I.1. Necklaces. How many different types of necklace designs can you form with $n$ beads, 
each having one of two colours, ◦ and •? Here are the possibilities for $n = 1, 2, 3$, 

[Figure: the necklace designs for n = 1, 2, 3] 

and it is postulated that orientation matters. This is equivalent to enumerating circular arrangements 
of two letters and an exhaustive listing program can be based on the smallest lexicographical 
representation of each word, as suggested by (17) below. The counting sequence starts as 
2, 3, 4, 6, 8, 14, 20, 36, 60, 108, 188, 352 and constitutes EIS A000031. [An explicit formula 
appears later in this chapter (p. 60).] What if two necklace designs that are mirror images of 
one another are identified? ◁ 
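The exhaustive listing program suggested in this note can be sketched as follows. This is a minimal brute-force illustration, assuming the two colours are written as the letters `a` and `b`; the function name `necklaces` is hypothetical. A word is kept only when it is the lexicographically smallest among its circular shifts, so each necklace is listed exactly once.

```python
from itertools import product


def necklaces(n, colours="ab"):
    """List necklaces with n beads, one canonical word per circular arrangement."""
    found = []
    for letters in product(colours, repeat=n):
        word = "".join(letters)
        rotations = [word[i:] + word[:i] for i in range(n)]
        if word == min(rotations):  # canonical representative of its class
            found.append(word)
    return found
```

Counting `len(necklaces(n))` for $n = 1, \ldots, 8$ reproduces the first eight terms of the sequence quoted in the note. The running time grows like $n\,2^n$, so this is only suitable for small $n$.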


▷ I.2. Unimodal permutations. Such a permutation has exactly one local maximum. In other 
words it is of the form $\sigma_1 \cdots \sigma_n$ with $\sigma_1 < \sigma_2 < \cdots < \sigma_k = n$ and $\sigma_k > \sigma_{k+1} > \cdots > \sigma_n$, 
for some $k \ge 1$. How many such permutations are there of size $n$? For $n = 5$, the number 
is 16: the permutations are 12345, 12354, 12453, 12543, 13452, 13542, 14532 and 15432 and 
their reversals. [Due to Jon Perry, see EIS A000079.] ◁ 
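A direct enumeration confirms the count given in this note for small sizes. The sketch below is a brute-force check with hypothetical function names; it tests every permutation against the definition (increasing up to the maximum, decreasing afterwards).

```python
from itertools import permutations


def is_unimodal(perm):
    """True if perm increases up to its maximum and decreases afterwards."""
    peak = perm.index(max(perm))
    rising = all(perm[i] < perm[i + 1] for i in range(peak))
    falling = all(perm[i] > perm[i + 1] for i in range(peak, len(perm) - 1))
    return rising and falling


def unimodal_permutations(n):
    """All unimodal permutations of 1..n, as tuples."""
    return [p for p in permutations(range(1, n + 1)) if is_unimodal(p)]
```

For sizes 1 to 7 the counts are 1, 2, 4, 8, 16, 32, 64, consistent with the pointer to EIS A000079: each element other than the maximum goes either to the left or to the right of it.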

It is also of interest to note that words and permutations could be enumerated 
using the most elementary counting principles, namely, for finite sets $\mathcal{B}$ and $\mathcal{C}$, 

(3)    $\operatorname{card}(\mathcal{B} \cup \mathcal{C}) = \operatorname{card}(\mathcal{B}) + \operatorname{card}(\mathcal{C})$   (provided $\mathcal{B} \cap \mathcal{C} = \emptyset$), 
       $\operatorname{card}(\mathcal{B} \times \mathcal{C}) = \operatorname{card}(\mathcal{B}) \cdot \operatorname{card}(\mathcal{C}).$ 

We shall see soon that these principles, which lie at the basis of our very concept of 
number, admit a powerful generalization (Equation (16) below). 

Next, for combinatorial enumeration purposes, it proves convenient to identify 
combinatorial classes that are merely variants of one another. 


Definition I.3. Two combinatorial classes $\mathcal{A}$ and $\mathcal{B}$ are said to be (combinatorially) 
isomorphic, which is written $\mathcal{A} \cong \mathcal{B}$, iff their counting sequences are identical. This 
condition is equivalent to the existence of a bijection from $\mathcal{A}$ to $\mathcal{B}$ that preserves size, 
and one also says that $\mathcal{A}$ and $\mathcal{B}$ are bijectively equivalent. 


We normally identify isomorphic classes and accordingly employ a plain equality 
sign ($\mathcal{A} = \mathcal{B}$). We then confine the notation $\mathcal{A} \cong \mathcal{B}$ to stress cases where combinatorial 
isomorphism results from some nontrivial transformation. 


Definition I.4. The ordinary generating function (OGF) of a sequence $(A_n)$ is the 
formal power series 

(4)    $A(z) = \sum_{n=0}^{\infty} A_n z^n.$ 

The ordinary generating function (OGF) of a combinatorial class $\mathcal{A}$ is the generating 
function of the numbers $A_n = \operatorname{card}(\mathcal{A}_n)$. Equivalently, the OGF of class $\mathcal{A}$ admits 
the combinatorial form 

(5)    $A(z) = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|}.$ 

It is also said that the variable $z$ marks size in the generating function. 

¹ Throughout this book, a reference like EIS Axxx points to Sloane’s Encyclopedia of Integer 
Sequences [439]. The data base contains more than 100,000 entries. 

FIGURE I.1. The class $\mathcal{T}$ of all triangulations of regular polygons (with size defined as 
the number of triangles) is a combinatorial class. The counting sequence starts as 

    $T_0 = 1,\ T_1 = 1,\ T_2 = 2,\ T_3 = 5,\ T_4 = 14,\ T_5 = 42.$ 

Euler determined the OGF $T(z) = \sum_n T_n z^n$ as $T(z) = \dfrac{1 - \sqrt{1 - 4z}}{2z}$, from which there 
results that $T_n = \dfrac{1}{n+1} \dbinom{2n}{n}$. These numbers are known as the Catalan numbers (p. 33). 



The combinatorial form of an OGF in (5) results straightforwardly from observing 
that the term $z^n$ occurs as many times as there are objects in $\mathcal{A}$ having size $n$. 

Naming convention. We adhere to a systematic naming convention: classes, their 
counting sequences, and their generating functions are systematically denoted by the 
same groups of letters: for instance, $\mathcal{A}$ for a class, $\{A_n\}$ (or $\{a_n\}$) for the counting 
sequence, and $A(z)$ (or $a(z)$) for its OGF. 


Coefficient extraction. We let generally $[z^n] f(z)$ denote the operation of extracting 
the coefficient of $z^n$ in the formal power series $f(z) = \sum f_n z^n$, so that 

(6)    $[z^n] \left( \sum_{n \ge 0} f_n z^n \right) = f_n.$ 

(The coefficient extractor $[z^n] f(z)$ reads as “coefficient of $z^n$ in $f(z)$”.) 

FIGURE I.2. A molecule, methylpyrrolidinyl-pyridine (nicotine), is a complex assembly 
whose description can be reduced to a single formula corresponding here to a total of 
26 atoms. 


The OGFs corresponding to our three examples $\mathcal{W}, \mathcal{P}, \mathcal{T}$ are then 

(7)    $W(z) = \sum_{n=0}^{\infty} 2^n z^n = \dfrac{1}{1 - 2z}$ 
       $P(z) = \sum_{n=0}^{\infty} n!\, z^n$ 
       $T(z) = \sum_{n=0}^{\infty} \dfrac{1}{n+1} \dbinom{2n}{n} z^n = \dfrac{1 - \sqrt{1 - 4z}}{2z}.$ 

The first expression relative to $W(z)$ is immediate as it is the sum of a geometric progression. 
The second generating function $P(z)$ is not related to simple functions of 
analysis. (Note that the expression makes sense within the strict framework of formal 
power series; see APPENDIX A: Formal power series, p. 676.) The third expression 
relative to $T(z)$ is equivalent to the explicit form of $T_n$ via Newton’s expansion of 
$(1 + x)^{1/2}$ (p. 33). The OGFs $W(z)$ and $T(z)$ can then also be interpreted as standard 
analytic objects, upon assigning to the formal variable $z$ values in the complex 
domain $\mathbb{C}$. In effect, the series $W(z)$ and $T(z)$ converge in a neighbourhood of 0 
and represent complex functions that are well defined near the origin, namely when 
$|z| < \frac{1}{2}$ for $W(z)$ and $|z| < \frac{1}{4}$ for $T(z)$. The OGF $P(z)$ is a purely formal power 
series (its radius of convergence is 0) that can nonetheless be subjected to the usual 
algebraic operations of power series. As a matter of fact, with very few exceptions, 
permutation enumeration is most conveniently approached by exponential generating 
functions developed in Chapter II. 
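The equivalence between the closed form of $T(z)$ and the explicit form of $T_n$ can be verified coefficient by coefficient. The following sketch is an illustration with hypothetical function names: it uses Newton's expansion $(1+x)^{1/2} = \sum_{k \ge 0} \binom{1/2}{k} x^k$ with $x = -4z$, so that $[z^n]\, \frac{1 - \sqrt{1-4z}}{2z} = -\frac{1}{2} \binom{1/2}{n+1} (-4)^{n+1}$, and it works in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def half_binomial(k):
    """Generalized binomial coefficient binom(1/2, k) as an exact fraction."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(1, 2) - i
        value /= i + 1
    return value


def catalan_from_newton(n):
    """Coefficient of z^n in (1 - sqrt(1 - 4z)) / (2z), via Newton's expansion."""
    return -half_binomial(n + 1) * (-4) ** (n + 1) / 2


def catalan_closed_form(n):
    """Explicit form T_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

Both functions agree on every index tested, for instance `catalan_from_newton(5)` and `catalan_closed_form(5)` both give 42.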


Combinatorial form of GFs. The combinatorial form (5) shows that generating 
functions are nothing but a reduced representation of the combinatorial class, where 
internal structures are destroyed and elements contributing to size (atoms) are replaced 
by the variable $z$. In a sense, this is analogous to what chemists do by writing linear 
reduced formulae for complex molecules (Figure 2). Great use of this observation was 
made by Schützenberger as early as the 1950s and 1960s. It explains in many ways 
why so many formal similarities are to be found between combinatorial structures and 
generating functions. 

FIGURE I.3. A finite family of graphs and its eventual reduction to a generating function. 


Figure 3 provides a combinatorial illustration: start with a (finite) family of graphs 
$\mathcal{H}$, with size taken as the number of vertices. Each vertex in each graph is replaced 
by the variable $z$ and the graph structure is “forgotten”; then the monomials corresponding 
to each graph are formed and the generating function is finally obtained 
by gathering all the monomials. For instance, there are 3 graphs of size 4 in $\mathcal{H}$, 
in agreement with the fact that $[z^4] H(z) = 3$. If size had been instead defined by 
number of edges, another generating function would have resulted, namely, with $y$ 
marking the new size: $1 + y + y^2 + 2y^3 + y^4 + y^6$. If both number of vertices 
and number of edges are of interest, then a bivariate generating function is needed: 
$H(z, y) = z + z^2 y + z^3 y^2 + z^3 y^3 + z^4 y^3 + z^4 y^4 + z^4 y^6$; such multivariate generating functions 
are developed systematically in Chapter III. 
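The passage from the bivariate generating function to either univariate one amounts to setting the other variable to 1. A small sketch follows, assuming the family is represented simply by the exponent pairs (vertices, edges) of the monomials of $H(z,y)$ as given in the text; the names `H_TERMS` and `marginal` are hypothetical.

```python
from collections import Counter

# Exponent pairs (number of vertices, number of edges), one per graph of the family.
H_TERMS = [(1, 0), (2, 1), (3, 2), (3, 3), (4, 3), (4, 4), (4, 6)]


def marginal(terms, index):
    """Collapse the bivariate GF onto one variable by setting the other to 1.

    Returns {exponent: coefficient}; index 0 keeps z (vertices), index 1 keeps y (edges).
    """
    return dict(Counter(term[index] for term in terms))


BY_VERTICES = marginal(H_TERMS, 0)
BY_EDGES = marginal(H_TERMS, 1)
```

Here `BY_VERTICES[4]` is the coefficient $[z^4]H(z)$, and `BY_EDGES` lists the coefficients of the generating function in which $y$ marks the number of edges.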


A path often taken in the literature is to decompose the structures to be enumerated 
into smaller structures either of the same type or of simpler types, and then 
extract from such a decomposition recurrence relations satisfied by the $\{A_n\}$. In this 
context, the recurrence relations are either solved directly—whenever they are simple 
enough—or by means of ad hoc generating functions, introduced as a mere technical 
artifice. 

By contrast, in the framework to be described, classes of combinatorial structures 
are built directly in terms of simpler classes by means of a collection of elementary 
combinatorial constructions. (This closely resembles the description of formal lan- 
guages by means of grammars, as well as the construction of structured data types in 
programming languages.) The approach developed here has been termed symbolic, as 
it relies on a formal specification language for combinatorial structures. Specifically, 
it is based on so-called admissible constructions that admit direct translations into 
generating functions. 


Definition I.5. Assume that $\Phi$ is a construction that associates to a finite collection 
of classes $\mathcal{B}, \mathcal{C}, \ldots$ a new class 

    $\mathcal{A} := \Phi[\mathcal{B}, \mathcal{C}, \ldots],$ 

in a finitary way: each $\mathcal{A}_n$ depends on finitely many of the $\{\mathcal{B}_j\}, \{\mathcal{C}_j\}, \ldots$. Then 
$\Phi$ is admissible iff the counting sequence $\{A_n\}$ of $\mathcal{A}$ only depends on the counting 
sequences $\{B_j\}, \{C_j\}, \ldots$ of $\mathcal{B}, \mathcal{C}, \ldots$ and for some operator $\Xi$ on sequences: 

    $\{A_n\} = \Xi[\{B_j\}, \{C_j\}, \ldots].$ 

In that case, since generating functions are determined by their coefficient sequences, 
there exists a well defined operator $\Psi$ translating $\Xi$ on the associated ordinary 
generating functions 

    $A(z) = \Psi[B(z), C(z), \ldots].$ 

As an introductory example, take the construction of cartesian product. 


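Definition I.6. The cartesian product construction of two classes $\mathcal{B}$ and $\mathcal{C}$ forms ordered 
pairs, 

(8)    $\mathcal{A} = \mathcal{B} \times \mathcal{C}$  iff  $\mathcal{A} = \{\alpha = (\beta, \gamma) \mid \beta \in \mathcal{B},\ \gamma \in \mathcal{C}\},$ 

with the size of a pair $\alpha = (\beta, \gamma)$ being defined by 

(9)    $|\alpha|_{\mathcal{A}} = |\beta|_{\mathcal{B}} + |\gamma|_{\mathcal{C}}.$ 


By considering all possibilities, it is immediately seen that the counting sequences 
corresponding to $\mathcal{A}, \mathcal{B}, \mathcal{C}$ are related by the convolution relation 

(10)   $A_n = \sum_{k=0}^{n} B_k C_{n-k}.$ 

We recognize here the formula for a product of two power series. Therefore, 

(11)   $A(z) = B(z) \cdot C(z).$ 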


Thus, the cartesian product is admissible: A cartesian product translates as a product 
of OGFs. 
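The convolution relation (10) can be checked by brute force on a small case. In the sketch below, which is an illustration and not part of the text, the two classes are hypothetical: binary words and words over a single letter, both truncated at length 4, with size equal to length. Pairs are enumerated explicitly and counted by total size, then compared with the convolution of the two counting sequences.

```python
from itertools import product


def counts(objects, size, n_max):
    """Counting sequence (A_0, ..., A_n_max) of a finite collection of objects."""
    seq = [0] * (n_max + 1)
    for obj in objects:
        if size(obj) <= n_max:
            seq[size(obj)] += 1
    return seq


def convolve(b, c):
    """Relation (10): A_n = sum_k B_k * C_(n-k), for n < len(b); needs len(c) >= len(b)."""
    return [sum(b[k] * c[n - k] for k in range(n + 1)) for n in range(len(b))]


N = 4
B = ["".join(w) for n in range(N + 1) for w in product("01", repeat=n)]
C = ["a" * n for n in range(N + 1)]

# Direct enumeration of ordered pairs, with size defined additively as in (9).
pairs = list(product(B, C))
direct = counts(pairs, lambda pair: len(pair[0]) + len(pair[1]), N)

# Same counts obtained from the two counting sequences alone.
via_convolution = convolve(counts(B, len, N), counts(C, len, N))
```

Truncating both classes at size `N` is harmless, since a pair of total size at most `N` can only involve components of size at most `N`.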
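Similarly, let $\mathcal{A}, \mathcal{B}, \mathcal{C}$ be combinatorial classes satisfying 

(12)   $\mathcal{A} = \mathcal{B} \cup \mathcal{C}$, with $\mathcal{B} \cap \mathcal{C} = \emptyset,$ 

with size defined in a consistent manner: for $\omega \in \mathcal{A}$, 

(13)   $|\omega|_{\mathcal{A}} = \begin{cases} |\omega|_{\mathcal{B}} & \text{if } \omega \in \mathcal{B} \\ |\omega|_{\mathcal{C}} & \text{if } \omega \in \mathcal{C}. \end{cases}$ 

One has 

(14)   $A_n = B_n + C_n,$ 

which, at generating function level, means 

(15)   $A(z) = B(z) + C(z).$ 

Thus, a union of sets translates as a sum of generating functions provided the sets are 
disjoint. 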


The correspondences provided by (8)–(11) and (12)–(15) are summarized by the 
dictionary 

(16)   $\mathcal{A} = \mathcal{B} \cup \mathcal{C} \implies A(z) = B(z) + C(z)$   (provided $\mathcal{B} \cap \mathcal{C} = \emptyset$) 
       $\mathcal{A} = \mathcal{B} \times \mathcal{C} \implies A(z) = B(z) \cdot C(z)$ 

(Compare with the plain arithmetic case of (3).) Their merit is that they can be stated 
as general-purpose translation rules that only need to be established once and for all. 
As soon as the problem of counting elements of a union of disjoint sets or a cartesian 
product is recognized, it becomes possible to dispense altogether with the intermediate 
stages of writing explicitly coefficient relations or recurrences such as (10) or (14). This 
is the spirit of the symbolic method for combinatorial enumerations. Its interest lies 
in the fact that several powerful set-theoretic constructions are amenable to such a 
treatment. 


I.2. Admissible constructions and specifications 


The main goal of this section is to introduce formally the basic constructions that 
constitute the core of a specification language for combinatorial structures. This core 
is based on disjoint unions, also known as combinatorial sums, and on Cartesian prod- 
ucts that we have just discussed. We shall augment it by the constructions of sequence, 
cycle, multiset, and powerset. A class is constructible or specifiable if it can be de- 
fined from primal elements by means of these constructions. The generating function 
of any such class satisfies functional equations that can be transcribed systematically 
from a specification; see Theorems I.1 and I.2, as well as Figure 14 at the end of this 
chapter for a summary. 


I.2.1. Basic constructions. First, we assume given a class $\mathcal{E}$ called the neutral 
class that consists of a single object of size 0; any such object of size 0 is called a 
neutral object and is usually denoted by symbols like $\epsilon$ or 1. The reason for this 
terminology becomes clear if one considers the combinatorial isomorphism 

    $\mathcal{A} \cong \mathcal{E} \times \mathcal{A} \cong \mathcal{A} \times \mathcal{E}.$ 

We also assume as given an atomic class $\mathcal{Z}$ comprising a single element of size 1; 
any such element is called an atom; an atom may be used to describe a generic node 
in a tree or graph, in which case it may be represented by a circle (• or ◦), but also a 
generic letter in a word, in which case it may be instantiated as $a, b, c, \ldots$. Distinct 
copies of the neutral or atomic class may also be subscripted by indices in various 
ways. Thus, for instance we use the classes $\mathcal{Z}_a = \{a\}$, $\mathcal{Z}_b = \{b\}$ (with $a, b$ of size 1) 
to build up binary words over the alphabet $\{a, b\}$, or $\mathcal{Z}_\bullet = \{\bullet\}$, $\mathcal{Z}_\circ = \{\circ\}$ (with •, ◦ 
taken to be of size 1) to build trees with nodes of two colours. Similarly, we introduce 
$\mathcal{E}_\Box, \mathcal{E}_1, \mathcal{E}_2$ to denote a class comprising the neutral objects $\Box, \epsilon_1, \epsilon_2$ respectively. 

Clearly, the generating functions of a neutral class $\mathcal{E}$ and an atomic class $\mathcal{Z}$ are 

    $E(z) = 1, \qquad Z(z) = z,$ 

corresponding to the unit 1, and the variable $z$, of generating functions. 

Combinatorial sum (disjoint union). First consider combinatorial sum, also known 
as disjoint union. The intent is to capture the union of disjoint sets, but without the 
constraint of any extraneous condition of disjointness. We formalize the (combinatorial) 
sum of two classes $\mathcal{B}$ and $\mathcal{C}$ as the union (in the standard set-theoretic sense) 
of two disjoint copies, say $\mathcal{B}^{\Box}$ and $\mathcal{C}^{\Diamond}$, of $\mathcal{B}$ and $\mathcal{C}$. A picturesque way to view the 
construction is as follows: first choose two distinct colours and repaint the elements of 
$\mathcal{B}$ with the □-colour and the elements of $\mathcal{C}$ with the ◇-colour. This is made precise by 
introducing two distinct “markers” □ and ◇, each a neutral object (i.e., of size zero); 
the disjoint union $\mathcal{B} + \mathcal{C}$ of $\mathcal{B}, \mathcal{C}$ is then defined as the standard set-theoretic union, 

    $\mathcal{B} + \mathcal{C} := (\{\Box\} \times \mathcal{B}) \cup (\{\Diamond\} \times \mathcal{C}).$ 

The size of an object in a disjoint union $\mathcal{A} = \mathcal{B} + \mathcal{C}$ is by definition inherited from its 
size in its class of origin, as in Equation (13). One good reason behind the definition 
adopted here is that the combinatorial sum of two classes is always well-defined. 
Furthermore, disjoint union is equivalent to a standard union whenever it is applied to 
disjoint sets. 

Because of disjointness, one has the implication 

    $\mathcal{A} = \mathcal{B} + \mathcal{C} \implies A_n = B_n + C_n \implies A(z) = B(z) + C(z),$ 

so that disjoint union is admissible. Note that, in contrast, standard set-theoretic union 
is not an admissible construction since 

    $\operatorname{card}(\mathcal{B}_n \cup \mathcal{C}_n) = \operatorname{card}(\mathcal{B}_n) + \operatorname{card}(\mathcal{C}_n) - \operatorname{card}(\mathcal{B}_n \cap \mathcal{C}_n),$ 

and information on the internal structure of $\mathcal{B}$ and $\mathcal{C}$ (i.e., the nature of this intersection) 
is needed in order to be able to enumerate the elements of their union. 
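The contrast between the combinatorial sum and the plain set-theoretic union can be made concrete on two small overlapping classes. The sketch below is illustrative only: the classes `B` and `C` are hypothetical finite sets of binary words (size is word length), and the two markers are represented by the strings `"box"` and `"diamond"`.

```python
def combinatorial_sum(b, c):
    """Disjoint union B + C: tag each element with a marker of size zero."""
    return {("box", x) for x in b} | {("diamond", x) for x in c}


def counts(objects, size, n_max):
    """Counting sequence (A_0, ..., A_n_max) of a finite collection of objects."""
    seq = [0] * (n_max + 1)
    for obj in objects:
        seq[size(obj)] += 1
    return seq


B = {"", "0", "1", "00", "01"}
C = {"", "1", "11", "01"}  # shares "", "1" and "01" with B

tagged = combinatorial_sum(B, C)
# The marker contributes nothing to the size: only the word itself counts.
sum_counts = counts(tagged, lambda t: len(t[1]), 2)
union_counts = counts(B | C, len, 2)
```

The counts of the combinatorial sum are the termwise sums of the counts of `B` and `C`, whatever the overlap, whereas the counts of `B | C` cannot be recovered from the two counting sequences alone.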

Cartesian product. This construction $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ forms all possible ordered pairs 
in accordance with Definition I.6. The size of a pair is obtained additively from the 
size of components in accordance with (9). 


Next, we introduce a few fundamental constructions that build upon set-theoretic 
union and product, and form sequences, sets, and cycles. These powerful construc- 
tions suffice to define a broad variety of combinatorial structures. 

Sequence construction. If $\mathcal{C}$ is a class then the sequence class $\mathrm{SEQ}(\mathcal{C})$ is defined 
as the infinite sum 

    $\mathrm{SEQ}(\mathcal{C}) = \{\epsilon\} + \mathcal{C} + (\mathcal{C} \times \mathcal{C}) + (\mathcal{C} \times \mathcal{C} \times \mathcal{C}) + \cdots$ 

with $\epsilon$ being a neutral structure (of size 0). (The neutral structure in this context 
plays a rôle similar to that of the “empty” word in formal language theory, while 
the sequence construction is somewhat analogous to the Kleene star operation (‘*’); 
see APPENDIX A: Regular languages, p. 678.) It is then readily checked that the 
construction $\mathcal{A} = \mathrm{SEQ}(\mathcal{C})$ defines a proper class satisfying the finiteness condition for 
sizes if and only if $\mathcal{C}$ contains no object of size 0. From the definition of size for sums 
and products, there results that the size of a sequence is to be taken as the sum of the 
sizes of its components: 

    $\gamma = (\alpha_1, \ldots, \alpha_\ell) \implies |\gamma| = |\alpha_1| + \cdots + |\alpha_\ell|.$ 
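The definition of the sequence class as an infinite sum of products already determines its counting sequence. The sketch below, an illustration with hypothetical function names, applies the sum and product rules literally on truncated counting sequences: it accumulates the counts of $\{\epsilon\}$, $\mathcal{C}$, $\mathcal{C} \times \mathcal{C}$, and so on. Since $\mathcal{C}$ has no object of size 0, a product of $k$ factors only contributes to sizes at least $k$, so finitely many terms suffice for any given size.

```python
def multiply(b, c):
    """Counting sequence of a cartesian product (convolution), truncated to len(b) terms."""
    return [sum(b[k] * c[n - k] for k in range(n + 1)) for n in range(len(b))]


def seq_counts(c):
    """Counting sequence of SEQ(C) up to size len(c) - 1, given the counts c of C."""
    if c[0] != 0:
        raise ValueError("SEQ(C) is a proper class only if C has no object of size 0")
    n_max = len(c) - 1
    total = [0] * (n_max + 1)
    power = [1] + [0] * n_max  # counts of the neutral class {epsilon}
    for _ in range(n_max + 1):  # adds the counts of C^0, C^1, ..., C^n_max
        total = [t + p for t, p in zip(total, power)]
        power = multiply(power, c)
    return total
```

For example, taking for $\mathcal{C}$ two atoms (counts `[0, 2, 0, ...]`) gives the powers of 2, as for binary words, while taking one object of size 1 and one of size 2 (counts `[0, 1, 1, 0, ...]`) gives 1, 1, 2, 3, 5, 8, 13.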


Cycle construction. Sequences taken up to a circular shift of their components 
define cycles, the notation being $\mathrm{CYC}(\mathcal{B})$. Precisely, one has 

    $\mathrm{CYC}(\mathcal{B}) := \mathrm{SEQ}(\mathcal{B}) / \mathbf{S},$ 

where $\mathbf{S}$ is the equivalence relation between sequences defined by 

    $(\alpha_1, \ldots, \alpha_r) \, \mathbf{S} \, (\beta_1, \ldots, \beta_r)$ 

iff there exists some circular shift $\tau$ of $[1\,..\,r]$ such that for all $j$, $\beta_j = \alpha_{\tau(j)}$; in other 
words, for some $d$, one has $\beta_j = \alpha_{1 + (j + d) \bmod r}$. Here is for instance a depiction of 
the cycles formed from the 8 and 16 sequences of lengths 3 and 4 over two types of 
objects ($a$, $b$): the number of cycles is 4 (for $n = 3$) and 6 (for $n = 4$). Sequences are 
grouped into equivalence classes according to the relation $\mathbf{S}$. 

                                  aaaa 
          aaa                     aaab aaba abaa baaa 
(17)      aab aba baa             aabb abba bbaa baab 
          abb bba bab             abab baba 
          bbb                     abbb bbba bbab babb 
                                  bbbb 

According to the definition, this construction corresponds to the formation of directed 
cycles. We make only a limited use of it for unlabelled objects; however, its counterpart 
plays a rather important rôle in the context of labelled structures and exponential 
generating functions. 

Multiset construction. Following common mathematical terminology, multisets
are like finite sets (that is, the order between elements does not count), but arbitrary
repetitions of elements are allowed. The notation is $\mathcal{A} = \mathrm{MSET}(\mathcal{B})$ when $\mathcal{A}$ is
obtained by forming all finite multisets of elements from $\mathcal{B}$. The precise way of defining
$\mathrm{MSET}(\mathcal{B})$ is as a quotient:

\[ \mathrm{MSET}(\mathcal{B}) := \mathrm{SEQ}(\mathcal{B})/\mathbf{R} \quad\text{with } \mathbf{R}, \]

the equivalence relation of sequences, being defined by $(\alpha_1,\ldots,\alpha_r)\ \mathbf{R}\ (\beta_1,\ldots,\beta_r)$
iff there exists some arbitrary permutation $\sigma$ of $[1\,..\,r]$ such that for all $j$, $\beta_j = \alpha_{\sigma(j)}$.

Powerset construction. The powerset class (or set class) $\mathcal{A} = \mathrm{PSET}(\mathcal{B})$ is
defined as the class consisting of all finite subsets of class $\mathcal{B}$, or equivalently, as the class
$\mathrm{PSET}(\mathcal{B}) \subset \mathrm{MSET}(\mathcal{B})$ formed of multisets that involve no repetitions.


We again need to make explicit the way the size function is defined when such 
constructions are performed: like for products and sequences, the size of a composite 
object—set, multiset, or cycle—is defined as the sum of the sizes of its components. 


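▷ I.3. The semi-ring of combinatorial classes. Under the convention of identifying isomorphic
classes, sum and product acquire pleasant algebraic properties: combinatorial sums and
cartesian products become commutative and associative operations, e.g.,

\[ (\mathcal{A}+\mathcal{B})+\mathcal{C} = \mathcal{A}+(\mathcal{B}+\mathcal{C}), \qquad \mathcal{A}\times(\mathcal{B}\times\mathcal{C}) = (\mathcal{A}\times\mathcal{B})\times\mathcal{C}, \]

while distributivity holds, $(\mathcal{A}+\mathcal{B})\times\mathcal{C} = (\mathcal{A}\times\mathcal{C})+(\mathcal{B}\times\mathcal{C})$. The proofs are simple verifications
from the definitions. ◁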


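▷ I.4. Natural numbers. Let $\mathcal{Z} := \{\bullet\}$ with • an atom (of size 1). Then $\mathcal{I} = \mathrm{SEQ}(\mathcal{Z}) \setminus \{\epsilon\}$
is a way of describing natural integers in unary notation: $\mathcal{I} = \{\bullet, \bullet\bullet, \bullet\bullet\bullet, \ldots\}$. The
corresponding OGF is $I(z) = z/(1-z) = z + z^2 + z^3 + \cdots$. ◁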


▷ I.5. Interval coverings. Let $\mathcal{Z} := \{\bullet\}$ be as before. Then $\mathcal{A} = \mathcal{Z} + (\mathcal{Z}\times\mathcal{Z})$ is a set of
two elements, • and (•, •), which we choose to draw as {•, •–•}. Then $\mathcal{C} = \mathrm{SEQ}(\mathcal{A})$ contains
elements like

    •,   • •,   •–•,   • •–•,   •–• •,   •–• •–•,   • • • •.

With the notion of size adopted, the objects of size $n$ in $\mathcal{C} = \mathrm{SEQ}(\mathcal{Z} + (\mathcal{Z}\times\mathcal{Z}))$ are (isomorphic
to) the coverings of the interval $[0, n]$ by intervals (matches) of length either 1 or 2. The
generating function,

\[ C(z) = 1 + z + 2z^2 + 3z^3 + 5z^4 + 8z^5 + 13z^6 + 21z^7 + 34z^8 + 55z^9 + \cdots, \]
is, as we shall see shortly (p. 40), the OGF of Fibonacci numbers. ◁
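The coefficients quoted in this note can be reproduced with a few lines of Python. This is a minimal sketch of ours, assuming the sequence rule $C(z) = 1/(1 - A(z))$ with $A(z) = z + z^2$; the helper name `seq_ogf` is hypothetical.

```python
def seq_ogf(a, n_terms):
    """Coefficients of 1/(1 - A(z)) up to z^(n_terms - 1).

    a[k] is the coefficient of z^k in A(z); a[0] must be 0.
    """
    c = [0] * n_terms
    c[0] = 1
    for n in range(1, n_terms):
        # C = 1 + A*C  gives  c_n = sum_{k>=1} a_k * c_{n-k}.
        c[n] = sum(a[k] * c[n - k] for k in range(1, min(n, len(a) - 1) + 1))
    return c

A = [0, 1, 1]              # A(z) = z + z^2, the OGF of Z + (Z x Z)
print(seq_ogf(A, 10))      # coefficients of z^0 .. z^9
```

The recurrence used in the loop is the Fibonacci recurrence $c_n = c_{n-1} + c_{n-2}$, which is why the Fibonacci numbers appear.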
I.2.2. The admissibility theorem for ordinary generating functions. This section
is a formal treatment of admissibility proofs for the constructions we have considered.
The final implication is that any specification of a constructible class translates
directly into generating function equations. The cycle construction involves the Euler
totient function $\varphi(k)$ defined as the number of integers in $[1, k]$ that are relatively
prime to $k$ (APPENDIX A: Arithmetical functions, p. 667).


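Theorem I.1 (Admissible unlabelled constructions). The constructions of union, cartesian
product, sequence, multiset, powerset, and cycle are all admissible. The associated
operators are

\[
\begin{array}{lll}
\text{Sum:} & \mathcal{A} = \mathcal{B} + \mathcal{C} & \Longrightarrow\quad A(z) = B(z) + C(z) \\[4pt]
\text{Product:} & \mathcal{A} = \mathcal{B} \times \mathcal{C} & \Longrightarrow\quad A(z) = B(z)\cdot C(z) \\[4pt]
\text{Sequence:} & \mathcal{A} = \mathrm{SEQ}(\mathcal{B}) & \Longrightarrow\quad A(z) = \dfrac{1}{1-B(z)} \\[8pt]
\text{Cycle:} & \mathcal{A} = \mathrm{CYC}(\mathcal{B}) & \Longrightarrow\quad A(z) = \displaystyle\sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log\frac{1}{1-B(z^k)} \\[8pt]
\text{Multiset:} & \mathcal{A} = \mathrm{MSET}(\mathcal{B}) & \Longrightarrow\quad A(z) = \displaystyle\prod_{n\ge 1} (1-z^n)^{-B_n} = \exp\Big(\sum_{k=1}^{\infty} \frac{B(z^k)}{k}\Big) \\[8pt]
\text{Powerset:} & \mathcal{A} = \mathrm{PSET}(\mathcal{B}) & \Longrightarrow\quad A(z) = \displaystyle\prod_{n\ge 1} (1+z^n)^{B_n} = \exp\Big(\sum_{k=1}^{\infty} (-1)^{k-1}\frac{B(z^k)}{k}\Big)
\end{array}
\]

The sequence, cycle, and set translations necessitate that $\mathcal{B}_0 = \emptyset$.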


The class $\mathcal{E} = \{\epsilon\}$ consisting of the neutral object only, and the class $\mathcal{Z}$ consisting of
a single “atomic” object (node, letter) of size 1 have OGFs

\[ E(z) = 1 \quad\text{and}\quad Z(z) = z. \]
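To make the dictionary of Theorem I.1 concrete, here is a minimal Python sketch (ours, not the authors') that applies the four non-trivial operators to truncated power series represented as lists of integer coefficients. The function names `seq`, `mset`, `pset`, `cyc`, and `totient` are our own choices; every input list is assumed to have length `N` and a zero constant term, in accordance with the condition $\mathcal{B}_0 = \emptyset$. The product forms are used for multisets and powersets so that the arithmetic stays in the integers; the cycle operator uses exact rationals.

```python
from fractions import Fraction
from math import gcd

def seq(b, N):
    """SEQ: coefficients of 1/(1 - B(z)) mod z^N."""
    a = [1] + [0] * (N - 1)
    for n in range(1, N):
        a[n] = sum(b[k] * a[n - k] for k in range(1, n + 1))
    return a

def mset(b, N):
    """MSET: product over n >= 1 of (1 - z^n)^(-b[n]) mod z^N."""
    a = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(b[n]):
            for i in range(n, N):                 # multiply by 1/(1 - z^n)
                a[i] += a[i - n]
    return a

def pset(b, N):
    """PSET: product over n >= 1 of (1 + z^n)^(b[n]) mod z^N."""
    a = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(b[n]):
            for i in range(N - 1, n - 1, -1):     # multiply by (1 + z^n)
                a[i] += a[i - n]
    return a

def totient(k):
    """Euler totient: number of integers in [1, k] relatively prime to k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def cyc(b, N):
    """CYC: sum over k >= 1 of phi(k)/k * log(1/(1 - B(z^k))) mod z^N."""
    s = seq(b, N)
    # L = log(1/(1 - B)) satisfies z L' = z B' / (1 - B),
    # hence n * l_n = sum_k k * b_k * s_{n-k}.
    logs = [Fraction(0)] * N
    for n in range(1, N):
        logs[n] = Fraction(sum(k * b[k] * s[n - k] for k in range(1, n + 1)), n)
    a = [0] * N
    for n in range(1, N):
        total = sum(Fraction(totient(k), k) * logs[n // k]
                    for k in range(1, n + 1) if n % k == 0)
        assert total.denominator == 1             # counts must be integers
        a[n] = total.numerator
    return a

N = 8
ints = [0] + [1] * (N - 1)                        # I(z) = z/(1 - z)
print(seq(ints, N))                               # integer compositions
print(mset(ints, N))                              # integer partitions
print(pset(ints, N))                              # partitions into distinct parts
print(cyc([0, 2] + [0] * (N - 2), N))             # binary necklaces, B(z) = 2z
```

The last line recovers, for sizes 3 and 4, the counts 4 and 6 of the depiction (17).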
PROOF. The proof proceeds by cases, building upon what we have just seen regarding
unions and products.
Combinatorial sum (disjoint union). Let $\mathcal{A} = \mathcal{B} + \mathcal{C}$. Since the union is disjoint,
and the size of an $\mathcal{A}$-element coincides with its size in $\mathcal{B}$ or $\mathcal{C}$, one has $A_n = B_n + C_n$
and $A(z) = B(z) + C(z)$, as discussed earlier. The rule also follows directly from
the combinatorial form of generating functions as expressed by (5):

\[ A(z) = \sum_{\alpha\in\mathcal{A}} z^{|\alpha|} = \sum_{\alpha\in\mathcal{B}} z^{|\alpha|} + \sum_{\alpha\in\mathcal{C}} z^{|\alpha|} = B(z) + C(z). \]


Cartesian product. The admissibility result for $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ was considered as
an example for Definition I.6, the convolution equation (10) leading to the relation
$A(z) = B(z)\cdot C(z)$. We can offer a direct derivation based on the combinatorial form
of generating functions (5),

\[ A(z) = \sum_{\alpha\in\mathcal{A}} z^{|\alpha|} = \sum_{(\beta,\gamma)\in(\mathcal{B}\times\mathcal{C})} z^{|\beta|+|\gamma|} = \Big(\sum_{\beta\in\mathcal{B}} z^{|\beta|}\Big) \times \Big(\sum_{\gamma\in\mathcal{C}} z^{|\gamma|}\Big) = B(z)\cdot C(z), \]

as follows from distributing products over sums. This derivation readily extends to an
arbitrary number of factors.
Sequence construction. Admissibility for $\mathcal{A} = \mathrm{SEQ}(\mathcal{B})$ (with $\mathcal{B}_0 = \emptyset$) follows
from the union and product relations. One has

\[ \mathcal{A} = \{\epsilon\} + \mathcal{B} + (\mathcal{B}\times\mathcal{B}) + (\mathcal{B}\times\mathcal{B}\times\mathcal{B}) + \cdots, \]

so that

\[ A(z) = 1 + B(z) + B(z)^2 + B(z)^3 + \cdots = \frac{1}{1-B(z)}, \]
where the geometric sum converges in the sense of formal power series since $[z^0]B(z) = 0$,
by assumption.
Powerset construction. Let $\mathcal{A} = \mathrm{PSET}(\mathcal{B})$ and first take $\mathcal{B}$ to be finite. Then, the
class $\mathcal{A}$ of all the finite subsets of $\mathcal{B}$ is isomorphic to a product,

\[ \mathrm{PSET}(\mathcal{B}) \cong \prod_{\beta\in\mathcal{B}} (\{\epsilon\} + \{\beta\}), \]

with $\epsilon$ a neutral structure of size 0. Indeed, distributing the products in all possible
ways forms all the possible combinations, i.e., sets, of elements of $\mathcal{B}$ with no repetition
allowed, by reasoning similar to what leads to such an identity as

\[ (1+a)(1+b)(1+c) = 1 + [a+b+c] + [ab+bc+ac] + abc, \]


where all combinations of variables appear. Then, directly from the combinatorial
form of generating functions and the sum and product rules, we find

\[ A(z) = \prod_{\beta\in\mathcal{B}} (1+z^{|\beta|}) = \prod_{n} (1+z^n)^{B_n}. \]
The exp-log transformation $A(z) = \exp(\log A(z))$ then yields

\[
\begin{aligned}
A(z) &= \exp\Big(\sum_{n=1}^{\infty} B_n \log(1+z^n)\Big) \\
     &= \exp\Big(\sum_{n=1}^{\infty} B_n \sum_{k=1}^{\infty} (-1)^{k-1}\frac{z^{nk}}{k}\Big) \\
     &= \exp\Big(\frac{B(z)}{1} - \frac{B(z^2)}{2} + \frac{B(z^3)}{3} - \cdots\Big),
\end{aligned}
\tag{18}
\]
where the second line results from expanding the logarithm,

\[ \log(1+u) = \frac{u}{1} - \frac{u^2}{2} + \frac{u^3}{3} - \cdots, \]

and the third line results from exchanging the order of summation.
The proof finally extends to the case of $\mathcal{B}$ being infinite by noting that each $A_n$
depends only on those $B_j$ for which $j \le n$, to which the relations given above for
the finite case apply. Precisely, let $\mathcal{B}^{(\le m)} = \sum_{k=1}^{m} \mathcal{B}_k$ and $\mathcal{A}^{(\le m)} = \mathrm{PSET}(\mathcal{B}^{(\le m)})$.
Then, with $O(z^{m+1})$ denoting any series that has no term of degree $\le m$, one has

\[ A(z) = A^{(\le m)}(z) + O(z^{m+1}) \quad\text{and}\quad B(z) = B^{(\le m)}(z) + O(z^{m+1}). \]

On the other hand, $A^{(\le m)}(z)$ and $B^{(\le m)}(z)$ are connected by the fundamental
exponential relation (18), since $\mathcal{B}^{(\le m)}$ is finite. Letting $m$ tend to infinity, there follows in
the limit

\[ A(z) = \exp\Big(\frac{B(z)}{1} - \frac{B(z^2)}{2} + \frac{B(z^3)}{3} - \cdots\Big). \]
(See APPENDIX A: Formal power series, p. 676 for the definition of formal convergence.)


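Multiset construction. First for finite $\mathcal{B}$ (with $\mathcal{B}_0 = \emptyset$), the multiset class
$\mathcal{A} = \mathrm{MSET}(\mathcal{B})$ is definable by

\[ \mathrm{MSET}(\mathcal{B}) \cong \prod_{\beta\in\mathcal{B}} \mathrm{SEQ}(\{\beta\}). \]
In words, any multiset can be sorted, in which case it can be viewed as formed of a
sequence of repeated elements $\beta_1$, followed by a sequence of repeated elements $\beta_2$,
and so on, where $\beta_1, \beta_2, \ldots$ is a canonical listing of the elements of $\mathcal{B}$. The relation
translates into generating functions by the product and sequence rules,

\[
\begin{aligned}
A(z) &= \prod_{\beta\in\mathcal{B}} (1-z^{|\beta|})^{-1} = \prod_{n=1}^{\infty} (1-z^n)^{-B_n} \\
     &= \exp\Big(\sum_{n=1}^{\infty} B_n \log(1-z^n)^{-1}\Big) \\
     &= \exp\Big(\frac{B(z)}{1} + \frac{B(z^2)}{2} + \frac{B(z^3)}{3} + \cdots\Big),
\end{aligned}
\]

where the exponential form results from the exp-log transformation. The case of an
infinite class $\mathcal{B}$ follows by a continuity argument analogous to the one used for powersets.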

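Cycle construction. The translation of the cycle relation $\mathcal{A} = \mathrm{CYC}(\mathcal{B})$ turns out
to be

\[ A(z) = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log\frac{1}{1-B(z^k)}, \]

where $\varphi(k)$ is the Euler totient function. The first terms, with $L_k(z) := \log(1 - B(z^k))^{-1}$,
are

\[ A(z) = \frac{1}{1}L_1(z) + \frac{1}{2}L_2(z) + \frac{2}{3}L_3(z) + \frac{2}{4}L_4(z) + \frac{4}{5}L_5(z) + \frac{2}{6}L_6(z) + \cdots. \]
We defer the proof to APPENDIX A: Cycle construction, p. 674, since it relies in part
on multivariate generating functions to be officially introduced in Chapter III.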


The results for sets, multisets, and cycles are particular cases of the well-known
Pólya theory that deals more generally with the enumeration of objects under group
symmetry actions [395, 397]. This theory is exposed in many textbooks; see for
instance [98, 259]. The approach adopted here consists in considering simultaneously
all possible values of the number of components by means of bivariate generating
functions. Powerful generalizations within the theory of species are presented in the
book by Bergeron, Labelle, and Leroux [39].
▷ I.6. Vallée’s identity. Let $\mathcal{M} = \mathrm{MSET}(\mathcal{C})$, $\mathcal{P} = \mathrm{PSET}(\mathcal{C})$. Separating elements of $\mathcal{C}$
according to the parity of the number of times they appear in a multiset gives rise to the identity
\[ M(z) = P(z)\,M(z^2). \]
(Hint: a multiset contains elements of either odd or even multiplicity.) Accordingly, one can
deduce the translation of powersets from the formula for multisets. Iterating the relation above
yields $M(z) = P(z)P(z^2)P(z^4)P(z^8)\cdots$, which is closely related to the binary representation
of numbers and to Euler’s identity (p. 46). It is used for instance in Note 56 p. 83. ◁
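The identity of this note can be checked numerically on a particular instance. In the Python sketch below (ours, with hypothetical helper names), we assume that $\mathcal{C}$ is the class of positive integers, with one object of each size $n \ge 1$, so that $M(z)$ counts integer partitions and $P(z)$ counts partitions into distinct parts. This is a check on one example, not a proof.

```python
N = 30  # work modulo z^N

def multisets(N):
    """Coefficients of prod_{n>=1} 1/(1 - z^n) mod z^N."""
    a = [1] + [0] * (N - 1)
    for n in range(1, N):
        for i in range(n, N):
            a[i] += a[i - n]
    return a

def powersets(N):
    """Coefficients of prod_{n>=1} (1 + z^n) mod z^N."""
    a = [1] + [0] * (N - 1)
    for n in range(1, N):
        for i in range(N - 1, n - 1, -1):
            a[i] += a[i - n]
    return a

def times(f, g, N):
    """Product of two truncated series mod z^N."""
    return [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(N)]

M, P = multisets(N), powersets(N)
M2 = [M[n // 2] if n % 2 == 0 else 0 for n in range(N)]   # M(z^2)
print(times(P, M2, N) == M)
```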
Restricted constructions. In order to increase the descriptive power of the framework
of constructions, we also want to allow restrictions on the number of components
in sequences, sets, multisets, and cycles. Let $\mathfrak{K}$ be a metasymbol representing any of
SEQ, CYC, MSET, PSET and let $\Omega$ be a predicate over the integers; then $\mathfrak{K}_{\Omega}(\mathcal{A})$ will
represent the class of objects constructed by $\mathfrak{K}$ but with a number of components
constrained to satisfy $\Omega$. Then, the notations

\[ \mathrm{SEQ}_{=k} \ (\text{or simply } \mathrm{SEQ}_k), \qquad \mathrm{SEQ}_{>k}, \qquad \mathrm{SEQ}_{1\,..\,k} \]

refer to sequences whose number of components is exactly $k$, larger than $k$, or in the
interval $1\,..\,k$ respectively, and the same holds for other constructions. In particular,

\[ \mathrm{SEQ}_k(\mathcal{B}) = \overbrace{\mathcal{B}\times\cdots\times\mathcal{B}}^{k\ \text{times}} \equiv \mathcal{B}^k, \qquad \mathrm{SEQ}_{\ge k}(\mathcal{B}) = \sum_{j\ge k} \mathcal{B}^j \cong \mathcal{B}^k \times \mathrm{SEQ}(\mathcal{B}), \]

\[ \mathrm{MSET}_k(\mathcal{B}) := \mathrm{SEQ}_k(\mathcal{B})/\mathbf{R}. \]
Similarly, $\mathrm{SEQ}_{\mathrm{odd}}$, $\mathrm{SEQ}_{\mathrm{even}}$ will denote sequences with an odd or even number of
components, and so on.

Translations for such restricted constructions are available, as shown generally in
Subsection I.6.1. Suffice it to note for the moment that the construction $\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B})$
is really an abbreviation for a $k$-fold product, hence it admits the translation into OGFs

\[ \mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \quad\Longrightarrow\quad A(z) = B(z)^k. \tag{19} \]


I. 2.3. Constructibility and combinatorial specifications. By composing basic 
constructions, we can build compact descriptions (specifications) of a broad variety of 
combinatorial classes. Since we restrict attention to admissible constructions, we can 
immediately derive OGFs for these classes. Put differently, the task of enumerating a 
combinatorial class is reduced to programming a specification for it in the language of 
admissible constructions. In this subsection, we first discuss the expressive power of 
the language of constructions, then summarize the symbolic method (for unlabelled 
classes and OGFs) by Theorem I.2. 

First, in the framework just introduced, the class of all binary words is described
by

\[ \mathcal{W} = \mathrm{SEQ}(\mathcal{A}) \quad\text{where}\quad \mathcal{A} = \{a, b\} \cong \mathcal{Z} + \mathcal{Z}, \]
the ground alphabet, comprises two elements (letters) of size 1. The size of a binary
word then coincides with its length (the number of letters it contains). In other words,
we start from basic atomic elements and build up words by forming freely all the
objects determined by the sequence construction. Such a combinatorial description of
a class that only involves a composition of basic constructions applied to initial classes
$\mathcal{E}$, $\mathcal{Z}$ is said to be an iterative (or nonrecursive) specification. Other examples already
encountered include binary necklaces (Note 1, p. 18) and the natural integers (Note 4,
p. 25) respectively defined by

\[ \mathcal{N} = \mathrm{CYC}(\mathcal{Z} + \mathcal{Z}) \quad\text{and}\quad \mathcal{I} = \mathrm{SEQ}_{\ge 1}(\mathcal{Z}). \]
From there, one can construct ever more complicated objects. For instance,
\[ \mathcal{P} = \mathrm{MSET}(\mathcal{I}) \equiv \mathrm{MSET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})) \]

means the class of multisets of natural integers, which is isomorphic to the class of
integer partitions (see Section I.3 below for a detailed discussion). As such examples
demonstrate, a specification that is iterative can be represented as a single term built on
$\mathcal{E}$, $\mathcal{Z}$ and the constructions $+$, $\times$, SEQ, CYC, MSET, PSET. An iterative specification
can be equivalently listed by naming some of the subterms (for instance partitions in
terms of natural integers themselves defined as sequences of atoms).


Semantics of recursion. We next turn our attention to recursive specifications,
starting with trees (cf. also APPENDIX A: Tree concepts, p. 681 for basic definitions).
In graph theory, a tree is classically defined as an undirected graph that is connected
and acyclic. Additionally, a tree is rooted if a particular vertex is distinguished to be
the root. Computer scientists commonly make use of trees called plane that are rooted
but also embedded in the plane, so that the ordering of subtrees attached to any node
matters. Here, we will give the name of general plane trees to such rooted plane trees
and call $\mathcal{G}$ their class, where size is the number of vertices; see, e.g., [434]. (The term
“general” refers to the fact that all node degrees are allowed.) For instance, a general
tree of size 16, drawn with the root on top, is:


As a consequence of the definition, if one interchanges, say, the second and third
root subtrees, then a different tree results: the original tree and its variant are not
equivalent under a smooth deformation of the plane. (General trees are thus comparable
to graphical renderings of genealogies where children are ordered by age.) Although
we have introduced plane trees as 2-dimensional diagrams, it is obvious that any tree
also admits a linear representation: a tree $\tau$ with root $\zeta$ and root subtrees $\tau_1, \ldots, \tau_r$
(in that order) can be seen as the object $\zeta\,\boxed{\tau_1\ \cdots\ \tau_r}$, where the box encloses similar
representations of subtrees. Typographically, a box $\boxed{\,\cdot\,}$ may be reduced to a matching
pair of parentheses, ‘(·)’, and one gets in this way a linear description that illustrates
the correspondence between trees viewed as plane diagrams and functional terms of
mathematical logic and computer science.

Trees are best described recursively. A tree is a root to which is attached a (possibly
empty) sequence of trees. In other words, the class $\mathcal{G}$ of general trees is definable
by the recursive equation

\[ \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}), \tag{20} \]

where $\mathcal{Z}$ comprises a single atom written “•” and denoting a generic node.

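Although such recursive definitions are familiar to computer scientists, the specification
(20) may look dangerously circular to some. One way of making good sense
of it is via an adaptation of the numerical technique of iteration. Start with $\mathcal{G}^{[0]} = \emptyset$,
the empty set, and define successively the classes

\[ \mathcal{G}^{[j+1]} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}^{[j]}). \]

For instance, $\mathcal{G}^{[1]} = \mathcal{Z} \times \mathrm{SEQ}(\emptyset) = \{(\bullet, \epsilon)\} \cong \{\bullet\}$ describes the tree of size 1, and

\[
\begin{aligned}
\mathcal{G}^{[2]} &= \{\bullet,\ \bullet\boxed{\bullet},\ \bullet\boxed{\bullet\,\bullet},\ \bullet\boxed{\bullet\,\bullet\,\bullet},\ \ldots\} \\
\mathcal{G}^{[3]} &= \{\bullet,\ \bullet\boxed{\bullet},\ \bullet\boxed{\bullet\,\bullet},\ \ldots,\ \bullet\boxed{\bullet\boxed{\bullet}},\ \bullet\boxed{\bullet\boxed{\bullet\,\bullet}},\ \bullet\boxed{\bullet\boxed{\bullet}\,\bullet},\ \ldots\}.
\end{aligned}
\]

First, each $\mathcal{G}^{[j]}$ is well-defined since it corresponds to a purely iterative specification.
Next, we have the inclusion $\mathcal{G}^{[j]} \subset \mathcal{G}^{[j+1]}$ ($\mathcal{G}^{[j]}$ admits of a simple interpretation as
the class of all trees of height $< j$). We can therefore regard the complete class $\mathcal{G}$ as
defined by the limit of the $\mathcal{G}^{[j]}$, that is, $\mathcal{G} := \bigcup_j \mathcal{G}^{[j]}$.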
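The iteration can be followed at the level of generating functions, where it reads $G^{[j+1]}(z) = z/(1 - G^{[j]}(z))$. The Python sketch below is ours (the name `step` is hypothetical); it works with coefficient lists truncated modulo $z^N$ and shows the coefficients stabilizing one after the other, since a tree of size $n$ has height less than $n$.

```python
def step(g, N):
    """One iteration G -> Z x SEQ(G) on coefficients mod z^N: returns z / (1 - g(z))."""
    s = [1] + [0] * (N - 1)                 # s = 1 / (1 - g), computed from s = 1 + g*s
    for n in range(1, N):
        s[n] = sum(g[k] * s[n - k] for k in range(1, n + 1))
    return [0] + s[:N - 1]                  # multiplication by z shifts the coefficients

N = 8
g = [0] * N                                 # OGF of the empty class G^[0]
for j in range(1, N):
    g = step(g, N)
    print(j, g)
```

After $j$ iterations, the coefficients of $z^1, \ldots, z^j$ have reached their final values.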


▷ I.7. Limes superior of classes. Let $\{\mathcal{A}^{[j]}\}$ be any increasing sequence of combinatorial
classes, in the sense that $\mathcal{A}^{[j]} \subset \mathcal{A}^{[j+1]}$ and the notions of size are compatible. If
$\mathcal{A}^{[\infty]} = \bigcup_j \mathcal{A}^{[j]}$ is a combinatorial class (i.e., there are finitely many elements of size $n$, for each $n$),
then the corresponding OGFs satisfy $A^{[\infty]}(z) = \lim_{j\to\infty} A^{[j]}(z)$ in the formal topology
(APPENDIX A: Formal power series, p. 676). ◁


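Definition I.7. A specification for an $r$-tuple $\vec{\mathcal{A}} = (\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$ of classes is a
collection of $r$ equations,

\[
\begin{cases}
\mathcal{A}^{(1)} = \Xi_1(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)}) \\
\mathcal{A}^{(2)} = \Xi_2(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)}) \\
\quad\cdots \\
\mathcal{A}^{(r)} = \Xi_r(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})
\end{cases}
\tag{21}
\]

where each $\Xi_i$ denotes a term built from the $\mathcal{A}$'s using the constructions of disjoint
union, cartesian product, sequence, set, multiset, and cycle, as well as the initial
classes $\mathcal{E}$ (neutral) and $\mathcal{Z}$ (atomic).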


We also say that the system is a specification of $\mathcal{A}^{(1)}$. A specification for a class of
combinatorial structures is thus a sort of formal grammar defining that class. Formally,
the system (21) is an iterative specification if it is strictly upper-triangular, that is,
$\mathcal{A}^{(r)}$ is defined solely in terms of initial classes $\mathcal{Z}$, $\mathcal{E}$; the definition of $\mathcal{A}^{(r-1)}$ only
involves $\mathcal{A}^{(r)}$, and so on; in that case, by back substitutions, it is apparent that for an
iterative specification, $\mathcal{A}^{(1)}$ can be equivalently described by a single term involving
only the initial classes and the basic constructors. Otherwise, the system is said to
be recursive. In the latter case, the semantics of recursion is identical to the one
introduced in the case of trees: start with the “empty” vector of classes, $\vec{\mathcal{A}}^{[0]} :=
(\emptyset, \ldots, \emptyset)$, iterate $\vec{\mathcal{A}}^{[j+1]} = \vec{\Xi}[\vec{\mathcal{A}}^{[j]}]$, and finally take the limit.


Definition I.8. A class of combinatorial structures is said to be constructible or specifiable
iff it admits a (possibly recursive) specification in terms of sum, product, sequence,
set, multiset, and cycle constructions.


At this stage, we have therefore defined a specification language for combina- 
torial structures which is some fragment of set theory with recursion added. Each 
constructible class has by virtue of Theorem I.1 an ordinary generating function for 
which defining equations can be produced systematically. In fact, it is even possible 
to use computer algebra systems in order to compute it automatically! See the article 
of Flajolet, Salvy, and Zimmermann [206] for the description of such a system. 


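Theorem I.2 (Symbolic method, unlabelled case). The generating function of a constructible
class is a component of a system of generating function equations whose
terms are built from

\[ 1,\ z,\ +,\ \times,\ Q,\ \mathrm{Exp},\ \overline{\mathrm{Exp}},\ \mathrm{Log}, \]

where

\[
\begin{aligned}
Q[f] &= \frac{1}{1-f}, &\qquad \mathrm{Log}[f] &= \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log\frac{1}{1-f(z^k)}, \\
\mathrm{Exp}[f] &= \exp\Big(\sum_{k=1}^{\infty} \frac{f(z^k)}{k}\Big), &\qquad \overline{\mathrm{Exp}}[f] &= \exp\Big(\sum_{k=1}^{\infty} (-1)^{k-1}\frac{f(z^k)}{k}\Big).
\end{aligned}
\]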


Pólya operators. The operator $Q$ translating sequences (SEQ) is classically known
as the quasi-inverse. The operator $\mathrm{Exp}$ (multisets, MSET) is called the Pólya exponential²
and $\overline{\mathrm{Exp}}$ (powersets, PSET) is the modified Pólya exponential. The operator $\mathrm{Log}$
is the Pólya logarithm. They are named after Pólya, who first developed the general
enumerative theory of objects under permutation groups [39, 395, 397].


The statement of Theorem I.2 signifies that iterative classes have explicit generat- 
ing functions involving compositions of the basic operators only, while recursive struc- 
tures have OGFs that are accessible indirectly via systems of functional equations. As 
we see at various places in this chapter, the following classes are constructible: binary
words, binary trees, general trees, integer partitions, integer compositions, nonplane 
trees, polynomials over finite fields, necklaces, and wheels. We conclude this section 
with a few examples. 


Binary words. The OGF of binary words, as seen already, can be obtained directly
from the iterative specification,

\[ \mathcal{W} = \mathrm{SEQ}(\mathcal{Z} + \mathcal{Z}) \quad\Longrightarrow\quad W(z) = \frac{1}{1-2z}, \]

whence the expected result, $W_n = 2^n$.


² It is a notable fact that, though the Pólya operators look algebraically “difficult” to compute with,
their treatment by complex asymptotic methods, as regards coefficient asymptotics, is comparatively “easy”.
We shall see many examples in Chapters IV–VII (e.g., pp. 239, 453).

General trees. The recursive specification of general trees leads to an implicit
definition of their OGF,

\[ \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad G(z) = \frac{z}{1-G(z)}. \]

From this point on, basic algebra does the rest. First the original equation is equivalent
(in the ring of formal power series) to $G - G^2 - z = 0$. Next, the quadratic equation
is solvable by radicals, and one finds

\[
\begin{aligned}
G(z) &= \frac{1}{2}\Big(1 - \sqrt{1-4z}\Big) \\
     &= z + z^2 + 2z^3 + 5z^4 + 14z^5 + 42z^6 + 132z^7 + 429z^8 + \cdots \\
     &= \sum_{n\ge 1} \frac{1}{n}\binom{2n-2}{n-1} z^n.
\end{aligned}
\]

(The conjugate root is to be discarded since it involves a nonzero constant term, whereas
$G_0 = 0$, as well as negative coefficients.) The expansion then results from Newton’s
binomial expansion,

\[ (1+x)^{\alpha} = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!}x^2 + \cdots, \]

applied with $\alpha = \frac{1}{2}$ and $x = -4z$.
The numbers

\[ C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!} \quad\text{with OGF}\quad C(z) = \frac{1-\sqrt{1-4z}}{2z} \tag{22} \]

are known as the Catalan numbers (EIS A000108) in honour of Eugène Catalan
(1814–1894), a French and Belgian mathematician who developed many of their properties.
These numbers are so common in combinatorics that we have decided to use
a roman font for denoting them (like “log”, “sin”, and so on). In summary, general
trees are enumerated by Catalan numbers:

\[ G_n = C_{n-1} \equiv \frac{1}{n}\binom{2n-2}{n-1}. \]

For this reason the term Catalan tree is often employed as synonymous with “general
(rooted unlabelled plane) tree”.
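The agreement between the implicit equation and the closed form can be verified directly. In the Python sketch below (ours; the name `general_trees` is hypothetical), the coefficients are extracted from the quadratic relation $G = z + G^2$ and compared with $\frac{1}{n}\binom{2n-2}{n-1}$.

```python
from math import comb

def general_trees(N):
    """Coefficients G_0 .. G_{N-1} obtained from G = z + G^2.

    Extracting [z^n] gives G_1 = 1 and G_n = sum_{k=1}^{n-1} G_k G_{n-k} for n >= 2.
    """
    g = [0] * N
    if N > 1:
        g[1] = 1
    for n in range(2, N):
        g[n] = sum(g[k] * g[n - k] for k in range(1, n))
    return g

g = general_trees(10)
closed = [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, 10)]
print(g)
print(g == closed)
```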


Triangulations. Fix $n + 2$ points arranged in anticlockwise order on a circle and
conventionally numbered from 0 to $n + 1$ (for instance the $(n+2)$th roots of unity). A
triangulation is defined as a maximal decomposition of the convex $(n+2)$-gon defined
by the points into $n$ triangles. Triangulations are taken here as abstract topological
configurations defined up to continuous deformations of the plane. The size of the
triangulation is the number of triangles, that is, $n$. Given a triangulation, we define
its “root” as a triangle chosen in some conventional and unambiguous manner (e.g., at
the start, the triangle that contains the two smallest labels). Then, a triangulation
decomposes into its root triangle and two subtriangulations (that may well be “empty”)
appearing on the left and right sides of the root triangle; the decomposition is
illustrated by the following diagram:


The class $\mathcal{T}$ of all triangulations can be specified recursively as
\[ \mathcal{T} = \{\epsilon\} + (\mathcal{T} \times \nabla \times \mathcal{T}), \]

provided that we consider a 2-gon (a diameter) as giving rise to an empty triangulation.
Consequently, the OGF satisfies the equation $T = 1 + zT^2$ and

\[ T(z) = \frac{1}{2z}\Big(1 - \sqrt{1-4z}\Big). \]

As a result, triangulations are enumerated by Catalan numbers:

\[ T_n = C_n \equiv \frac{1}{n+1}\binom{2n}{n}. \]

This particular result goes back to Euler and Segner (1753), a century before Catalan;
see Figure 1 for first values and p. 69 for related bijections.


▷ I.8. A bijection. Since both general trees and triangulations are enumerated by Catalan
numbers, there must exist a size-preserving bijection between the two classes. Find one such
bijection. [Hint: the construction of triangulations is evocative of binary trees, and binary trees
are themselves in bijective correspondence with general trees; see APPENDIX A: Tree concepts,
p. 681.] ◁


▷ I.9. A variant specification of triangulations. Consider the class $\mathcal{U}$ of “nonempty” triangulations
of the $n$-gon, that is, we exclude the 2-gon and the corresponding “empty” triangulation
of size 0. Then, $\mathcal{U} = \mathcal{T} \setminus \{\epsilon\}$ admits the specification

\[ \mathcal{U} = \nabla + (\nabla \times \mathcal{U}) + (\mathcal{U} \times \nabla) + (\mathcal{U} \times \nabla \times \mathcal{U}), \]

which also leads to the Catalan numbers via $U = z(1 + U)^2$ and $U(z) = (1 - 2z -
\sqrt{1-4z})/(2z)$, so that $U(z) = T(z) - 1$. ◁


I.2.4. Exploiting generating functions and counting sequences. In this book
we are going to see altogether more than a hundred applications of the symbolic
method. Before engaging in technical developments, it is worth inserting a few comments
on the way generating functions and counting sequences can be put to good use
in order to solve combinatorial problems.
Explicit enumeration formulae. In a number of situations, generating functions
are explicit and can be expanded in such a way that explicit formulae result for their
coefficients. A prime example is the counting of general trees and of triangulations
above, where the quadratic equation satisfied by an OGF is amenable to an explicit
solution; the resulting OGF could then be expanded by means of Newton’s binomial
theorem. Similarly, we derive later in this chapter an explicit form for the number of
integer compositions by means of the symbolic method and OGFs (the answer turns
out to be simply $2^{n-1}$) and derive many explicit specializations. In this book, we
assume as known the elementary techniques from basic calculus by which the Taylor
expansion of an explicitly given function can be obtained. (Good references on such
elementary aspects are Wilf’s Generatingfunctionology [496], Graham, Knuth, and
Patashnik’s Concrete Mathematics [248], and our book [434].)


Implicit enumeration formulae. In a number of cases, the generating functions
obtained by the symbolic method are still in a sense explicit, but their form is such that
their coefficients are not clearly reducible to a closed form. It is then still possible to
obtain initial values of the corresponding counting sequence by means of a symbolic
manipulation system. Also, from generating functions, it is possible to derive systematically
recurrences³ that lead to a procedure for computing an arbitrary number of
terms of the counting sequence in a reasonably efficient manner. A typical example of
this situation is the OGF of integer partitions,

\[ P(z) = \prod_{m=1}^{\infty} \frac{1}{1-z^m}, \]

for which recurrences obtained from the OGF and associated to fast algorithms are
given in Note 12 (p. 40) and Note 17 (p. 46).
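As an illustration of how initial values can be obtained from such a product form, the following Python sketch (ours; the name `partition_numbers` is hypothetical) expands $P(z)$ by multiplying in the factors $1/(1 - z^m)$ one at a time. It is a plain expansion using on the order of $N^2$ additions, not one of the faster recurrences referred to in the Notes.

```python
def partition_numbers(N):
    """Coefficients P_0 .. P_N of prod_{m>=1} 1/(1 - z^m), truncated after z^N."""
    p = [1] + [0] * N
    for m in range(1, N + 1):
        # Multiplying by 1/(1 - z^m) amounts to p[n] += p[n - m], for increasing n.
        for n in range(m, N + 1):
            p[n] += p[n - m]
    return p

p = partition_numbers(100)
print(p[:11])
print(p[100])
```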


Asymptotic formulae. Such forms are our eventual goal as they allow for an easy
interpretation and comparison of counting sequences. From a quick glance at the
table of initial values of $W_n$, $P_n$, $T_n$ given in Eq. (2), it is apparent that $W_n$ grows
more slowly than $T_n$, which itself grows more slowly than $P_n$. The classification
of growth rates of counting sequences belongs properly to the asymptotic theory of
combinatorial structures which neatly relates to the symbolic method via complex
analysis. A thorough treatment of this part of the theory is presented in Chapters IV–VIII.
Given the methods exposed there, it becomes possible to estimate asymptotically
the coefficients of virtually any generating function, however complicated⁴, that is
provided by the symbolic method.

Here, we content ourselves with a few remarks based on elementary real analysis.
(The basic notations are described in APPENDIX A: Asymptotic Notation, p. 668.)
The sequence $W_n = 2^n$ grows exponentially and, in such an extremely simple case, the
exact form coincides with the asymptotic form. The sequence $P_n = n!$ must grow at a
faster asymptotic regime. But how fast? The answer is provided by Stirling’s formula,


³ See [206, 216, 373] for such systematic approaches.
⁴ In a number of cases, asymptotic analysis even applies to situations where the generating function
itself is not even explicit, but only accessible through a functional equation of sorts.

FIGURE I.4. The growth regimes of three sequences $f(n) = 2^n, T_n, n!$ (from bottom
to top) rendered by a plot of $\log_{10} f(n)$ versus $n$.


an approximation to the factorial numbers due to the Scottish mathematician James
Stirling (1692–1770):

\[ n! = \Big(\frac{n}{e}\Big)^{n} \sqrt{2\pi n}\,\Big(1 + O\Big(\frac{1}{n}\Big)\Big) \qquad (n \to +\infty). \tag{23} \]
The ratios of the exact values to Stirling’s approximations

    n                                  1         2         5         10        100       1,000
    n! / (n^n e^{-n} sqrt(2 pi n))     1.084437  1.042207  1.016783  1.008365  1.000833  1.000083

show an excellent quality of the asymptotic estimate: the error is only 8% for $n = 1$,
less than 1% for $n = 10$, and less than 1 per thousand for any $n$ greater than 100.
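The ratios in the table are easy to recompute. The Python sketch below is ours (the name `stirling_ratio` is hypothetical); it works with logarithms, using the standard library function `math.lgamma` for $\log n!$, so that large values of $n$ do not overflow. Being a floating-point computation, it can only be expected to agree with the table to the digits shown.

```python
from math import exp, lgamma, log, pi

def stirling_ratio(n):
    """Ratio n! / (n^n e^(-n) sqrt(2 pi n)), computed through logarithms."""
    log_exact = lgamma(n + 1)                          # log(n!)
    log_approx = n * log(n) - n + 0.5 * log(2 * pi * n)
    return exp(log_exact - log_approx)

for n in (1, 2, 5, 10, 100, 1000):
    print(n, round(stirling_ratio(n), 6))
```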
Stirling’s formula in turn gives access to the asymptotic form of the Catalan numbers,
by means of a simple calculation:

\[ C_n = \frac{1}{n+1}\,\frac{(2n)!}{(n!)^2} \sim \frac{1}{n}\,\frac{(2n)^{2n} e^{-2n} \sqrt{4\pi n}}{n^{2n} e^{-2n}\, 2\pi n}, \]
which simplifies to
\[ C_n \sim \frac{4^n}{\sqrt{\pi n^3}}. \tag{24} \]

Thus, the growth of Catalan numbers is roughly comparable to an exponential, $4^n$,
modulated by a subexponential factor, here $1/\sqrt{\pi n^3}$. A surprising consequence of this
asymptotic estimate to the area of boolean function complexity appears in Example 16
below.

Altogether, the asymptotic number of general trees and triangulations is well summarized
by a simple formula. Approximations become more and more accurate as $n$
becomes large. Figure 4 illustrates the different growth regimes of our three reference
sequences while Figure 5 exemplifies the quality of the approximation with subtler
phenomena also apparent on the figures and well explained by asymptotic theory.
Such asymptotic formulae then make comparison between the growth rates of sequences
easy.

    n          C_n                    C*_n                   C*_n / C_n

    1          1                      2.25                   2.25675 83341 91025 14779 23178
    10         16796                  18707.89               1.11383 05127 52445 89437 89064
    100        0.89651 · 10^57        0.90661 · 10^57        1.01126 32841 24540 52257 13957
    1000       0.20461 · 10^598       0.20484 · 10^598       1.00112 51328 15424 16470 1282
    10000      0.22453 · 10^6015      0.22456 · 10^6015      1.00011 25013 28127 92913 51406
    100000     0.17805 · 10^60199     0.17805 · 10^60199     1.00001 12500 13281 25292 96322
    1000000    0.55303 · 10^602051    0.55303 · 10^602051    1.00000 11250 00132 81250 29296

FIGURE I.5. The Catalan numbers $C_n$, their Stirling approximation $C^{\star}_n = 4^n/\sqrt{\pi n^3}$,
and the ratio $C^{\star}_n / C_n$.
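The leading digits of the last column can be reproduced in ordinary floating-point arithmetic. The Python sketch below is ours (the name `catalan_ratio` is hypothetical); it evaluates $\log C_n$ through `math.lgamma`, so only the first several digits of each ratio are trustworthy, far fewer than the 25 decimals displayed in the figure, which require multiprecision arithmetic.

```python
from math import exp, lgamma, log, pi

def catalan_ratio(n):
    """Ratio C*_n / C_n with C*_n = 4^n / sqrt(pi n^3), computed through logarithms."""
    # log C_n = log (2n)! - log (n+1)! - log n!
    log_c = lgamma(2 * n + 1) - lgamma(n + 2) - lgamma(n + 1)
    log_c_star = n * log(4) - 0.5 * log(pi * n ** 3)
    return exp(log_c_star - log_c)

for n in (1, 10, 100, 1000):
    print(n, round(catalan_ratio(n), 7))
```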


▷ I.10. The complexity of coding. A company specialized in computer aided design has sold
to you a scheme that (they claim) can encode any triangulation of size $n \ge 100$ using at most
$1.5n$ bits of storage. After reading these pages, what do you do? [Hint: sue them!] See also
Note 22 for related coding arguments. ◁


▷ I.11. Experimental asymptotics. From the data of Figure 5, guess the value of $C^{\star}_{10^7} / C_{10^7}$
and of $C^{\star}_{5\cdot 10^6} / C_{5\cdot 10^6}$ to 25D. (See, e.g., [313] for related asymptotic expansions and [64] for
similar properties.) ◁

The interplay between combinatorial structure and asymptotic structure is indeed 
the principal theme of this book. We shall see that a vast majority of the generating 
functions provided by the symbolic method, however complicated, eventually lead to 
similarly simple asymptotic estimates. 


I.3. Integer compositions and partitions 


This section and the next ones provide examples of counting via specifications in 
classical combinatorial domains. They illustrate the benefits of the symbolic method: 
generating functions are obtained with hardly any computation, and at the same time, 
many counting refinements follow from a basic combinatorial construction. The most 
direct applications described here relate to the additive decomposition of integers 
into summands with the classical combinatorial-arithmetic structures of partitions and 
compositions. The specifications are iterative and simply combine two levels of con- 
structions of type SEQ, MSET, CYC, PSET. 


I.3.1. Compositions and partitions. Our first examples have to do with decom- 
posing integers into sums. 


Definition I.9. A composition of an integer $n$ is a sequence $(x_1, x_2, \ldots, x_k)$ of integers
(for some $k$) such that

\[ n = x_1 + x_2 + \cdots + x_k, \qquad x_j \ge 1. \]
A partition of an integer $n$ is a sequence $(x_1, x_2, \ldots, x_k)$ of integers (for some $k$) such
that

\[ n = x_1 + x_2 + \cdots + x_k \quad\text{and}\quad x_1 \ge x_2 \ge \cdots \ge x_k. \]
FIGURE I.6. Graphical representations of compositions and partitions: (left) the
composition 1 + 3 + 1 + 4 + 2 + 3 = 14 with its “ragged-landscape” and “balls-and-bars”
models; (right) the partition 8 + 8 + 6 + 5 + 4 + 4 + 4 + 2 + 1 + 1 = 43 with its staircase
(Ferrers diagram) model.


In both cases, the $x_j$’s are called the summands or the parts and the quantity $n$ is
called the size of the composition or the partition.


By representing summands in unary using small discs (“•”), we can render graphically
a composition by drawing bars between some of the balls; if we arrange summands
vertically, compositions appear as ragged landscapes. In contrast, partitions
appear as staircases, also known as Ferrers diagrams [98, p. 100]; see Figure 6. We
let $\mathcal{C}$ and $\mathcal{P}$ denote the classes of all compositions and all partitions, respectively. Since a
set can always be presented in sorted order, the difference between compositions and
partitions lies in the fact that the order of summands does or does not matter. This is
reflected by the use of a sequence construction (for $\mathcal{C}$) against a multiset construction
(for $\mathcal{P}$). In this perspective, it proves convenient to regard 0 as obtained by the empty
sequence of summands ($k = 0$), and we shall do so from now on.

First, let $\mathcal{I} = \{1, 2, \ldots\}$ denote the combinatorial class of all integers at least 1
(the summands), and let the size of each integer be its value. Then, the OGF of $\mathcal{I}$ is,
as we know,

(25)    \[ I(z) = \sum_{n \ge 1} z^n = \frac{z}{1 - z}, \]

since $I_n = 1$ for $n \ge 1$, corresponding to the fact that there is exactly one object in $\mathcal{I}$
for each size $n \ge 1$. If integers are represented in unary, say by small balls, one has

(26)    \[ \mathcal{I} = \{1, 2, 3, \ldots\} = \{\bullet, \bullet\bullet, \bullet\bullet\bullet, \ldots\} \cong \mathrm{SEQ}_{\ge 1}\{\bullet\}, \]

which is another way to view the equality $I(z) = z/(1 - z)$.


Compositions. First, the specification of compositions as sequences admits, by
Theorem I.1, a direct translation into OGF:

(27)    \[ \mathcal{C} = \mathrm{SEQ}(\mathcal{I}) \quad\Longrightarrow\quad C(z) = \frac{1}{1 - I(z)}. \]

The collection of equations (25), (27) thus fully determines $C(z)$:

\[ C(z) = \frac{1}{1 - \frac{z}{1 - z}} = \frac{1 - z}{1 - 2z} = 1 + z + 2z^2 + 4z^3 + 8z^4 + 16z^5 + 32z^6 + \cdots. \]

n    C_n  P_n
0    1  1
10   512  42
20   524288  627
30   536870912  5604
40   549755813888  37338
50   562949953421312  204226
60   576460752303423488  966467
70   590295810358705651712  4087968
80   604462909807314587353088  15796476
90   618970019642690137449562112  56634173
100  633825300114114700748351602688  190569292
110  649037107316853453566312041152512  607163746
120  664613997892457936451903530140172288  1844349560
130  680564733841876926926749214863536422912  5371315400
140  696898287454081973172991196020261297061888  15065878135
150  713623846352979940529142984724747568191373312  40853235313
160  730750818665451459101842416358141509827966271488  107438159466
170  748288838313422294120286634350736906063837462003712  274768617130
180  766247770432944429179173513575154591809369561091801088  684957390936
190  784637716923335095479473677900958302012794430558004314112  1667727404093
200  803469022129495137770981046170581301261101496891396417650688  3972999029388
210  822752278660603021077484591278675252491367932816789931674304512  9275102575355
220  842498333348457493583344221469363458551160763204392890034487820288  21248279009367
230  862718293348820473429344482784628181556388621521298319395315527974912  47826239745920
240  883423532389192164791648750371459257913741948437809479060803100646309888  105882246722733
250  904625697166532776746648320380374280103671755200316906558262375061821325312  230793554364681

FIGURE I.7. For $n = 0, 10, 20, \ldots, 250$ (left), the number of compositions $C_n$ (middle)
and the number of partitions $P_n$ (right). The figure illustrates the difference in growth between
$C_n = 2^{n-1}$ and $P_n = e^{O(\sqrt{n})}$.


From there, the counting problem for compositions is solved by a straightforward
expansion of the OGF: one has

\[ C(z) = \sum_{n \ge 0} 2^n z^n - \sum_{n \ge 0} 2^n z^{n+1}, \]

implying

\[ C_n = 2^{n-1} \quad (n \ge 1), \qquad C_0 = 1. \]

This agrees with basic combinatorics since a composition of $n$ can be viewed as the
placement of separation bars at a subset of the $n - 1$ existing places in between $n$
aligned balls (the “balls and bars” model of Figure 6), of which there are clearly $2^{n-1}$
possibilities.

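Partitions. For partitions specified as multisets, the general translation mechanism
provides

(28)    \[ \mathcal{P} = \mathrm{MSET}(\mathcal{I}) \quad\Longrightarrow\quad P(z) = \exp\left( I(z) + \frac{1}{2} I(z^2) + \frac{1}{3} I(z^3) + \cdots \right), \]

with product form

\[ \begin{aligned}
P(z) &= \prod_{m \ge 1} \frac{1}{1 - z^m} \\
     &= (1 + z + z^2 + \cdots)(1 + z^2 + z^4 + \cdots)(1 + z^3 + z^6 + \cdots) \cdots \\
     &= 1 + z + 2z^2 + 3z^3 + 5z^4 + 7z^5 + 11z^6 + 15z^7 + 22z^8 + \cdots,
\end{aligned} \]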


where the counting sequence is EIS A000041. Contrary to compositions that are
counted by the explicit formula $2^{n-1}$, no simple form exists for $P_n$. Asymptotic
analysis of the OGF (28) based on the saddle point method (Chapter VIII) shows that
$P_n = e^{O(\sqrt{n})}$. In fact a very famous theorem of Hardy and Ramanujan, later improved
by Rademacher (see Andrews’ book [10] and Chapter VIII), provides a full expansion
of which the asymptotically dominant term is

\[ P_n \sim \frac{1}{4n\sqrt{3}} \exp\left( \pi \sqrt{\frac{2n}{3}} \right). \]

There are consequently appreciably fewer partitions than compositions (Figure 7).

▷ I.12. A recurrence for the partition numbers. Logarithmic differentiation gives

\[ z \frac{P'(z)}{P(z)} = \sum_{n \ge 1} \frac{n z^n}{1 - z^n}, \qquad\text{implying}\qquad n P_n = \sum_{j=1}^{n} \sigma(j) P_{n-j}, \]

where $\sigma(n)$ is the sum of the divisors of $n$ (e.g., $\sigma(6) = 1 + 2 + 3 + 6 = 12$). Consequently,
$P_1, \ldots, P_N$ can be computed in $O(N^2)$ integer-arithmetic operations. (The technique is generally
applicable to powersets and multisets; see Note 40 for another application. Note 17 further
lowers the bound in the case of partitions to $O(N\sqrt{N})$.) ◁
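
As an illustration of Note I.12, the recurrence $n P_n = \sum_j \sigma(j) P_{n-j}$ can be turned into a short program. The sketch below is only one possible rendering; the function names `divisor_sums` and `partition_numbers` are illustrative choices, not notation from the text.

```python
def divisor_sums(N):
    """Return a list sigma with sigma[n] = sum of the divisors of n, for 1 <= n <= N."""
    sigma = [0] * (N + 1)
    for d in range(1, N + 1):
        for multiple in range(d, N + 1, d):
            sigma[multiple] += d
    return sigma


def partition_numbers(N):
    """Return [P_0, ..., P_N] from n * P_n = sum_{j=1..n} sigma(j) * P_{n-j}."""
    sigma = divisor_sums(N)
    P = [1] + [0] * N
    for n in range(1, N + 1):
        total = sum(sigma[j] * P[n - j] for j in range(1, n + 1))
        P[n] = total // n  # exact division, by the recurrence
    return P


print(partition_numbers(10))
```

The double loop makes the $O(N^2)$ operation count visible; the values at $n = 10, 20, \ldots$ can be compared with the right-hand column of Figure I.7.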

By varying (27) and (28), we can use the symbolic method to derive a number of 
counting results in a straightforward manner. First, we state: 


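Proposition I.1. Let $\mathcal{T} \subseteq \mathcal{I}$ be a subset of the positive integers. The OGFs of the
classes $\mathcal{C}^{\mathcal{T}} := \mathrm{SEQ}(\mathrm{SEQ}_{\mathcal{T}}(\mathcal{Z}))$ and $\mathcal{P}^{\mathcal{T}} := \mathrm{MSET}(\mathrm{SEQ}_{\mathcal{T}}(\mathcal{Z}))$ of compositions and
partitions having summands restricted to $\mathcal{T}$ are given by

\[ C^{\mathcal{T}}(z) = \frac{1}{1 - \sum_{n \in \mathcal{T}} z^n} = \frac{1}{1 - T(z)}, \qquad P^{\mathcal{T}}(z) = \prod_{n \in \mathcal{T}} \frac{1}{1 - z^n}. \]

PROOF. The statement results directly from Theorem I.1.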


This proposition permits us to enumerate compositions and partitions with re- 
stricted summands, as well as with a fixed number of parts. 


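EXAMPLE I.4. Compositions with restricted summands. In order to enumerate the class $\mathcal{C}^{\{1,2\}}$
of compositions of $n$ whose parts are only allowed to be taken from the set $\{1, 2\}$, simply write

\[ \mathcal{C}^{\{1,2\}} = \mathrm{SEQ}(\mathcal{I}^{\{1,2\}}) \quad\text{with}\quad \mathcal{I}^{\{1,2\}} = \{1, 2\}. \]

Thus, in terms of generating functions, one has

\[ C^{\{1,2\}}(z) = \frac{1}{1 - I^{\{1,2\}}(z)} \quad\text{with}\quad I^{\{1,2\}}(z) = z + z^2. \]

This formula implies

\[ C^{\{1,2\}}(z) = \frac{1}{1 - z - z^2} = 1 + z + 2z^2 + 3z^3 + 5z^4 + 8z^5 + 13z^6 + \cdots, \]

and the number of compositions of $n$ in this class is expressed by a Fibonacci number,

\[ C_n^{\{1,2\}} = F_{n+1} \quad\text{where}\quad F_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right]. \]

In particular, the rate of growth is of the exponential type $\varphi^n$, where $\varphi := \frac{1 + \sqrt{5}}{2}$ is the golden
ratio.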

Similarly, compositions such that all their summands lie in the set $\{1, 2, \ldots, r\}$ have
generating function

\[ C^{\{1,\ldots,r\}}(z) = \frac{1}{1 - z - z^2 - \cdots - z^r} = \frac{1}{1 - z \frac{1 - z^r}{1 - z}} = \frac{1 - z}{1 - 2z + z^{r+1}}, \]

and the corresponding counts are given by generalized Fibonacci numbers. A double combinatorial
sum expresses these counts

(30)    \[ C_n^{\{1,\ldots,r\}} = [z^n] \sum_{j} \left( \frac{z (1 - z^r)}{1 - z} \right)^j = \sum_{j,k} (-1)^k \binom{j}{k} \binom{n - rk - 1}{j - 1}. \]

This result is perhaps not too useful for grasping the rate of growth of the sequence when $n$ gets
large, so that asymptotic analysis is called for. Asymptotically, for any fixed $r \ge 2$, there is a
unique root $\rho_r$ of the denominator $1 - 2z + z^{r+1}$ in $(\frac{1}{2}, 1)$; this root dominates all the other
roots and is simple. Methods amply developed in Chapter IV imply that, for some constant
$c_r > 0$,

(31)    \[ C_n^{\{1,\ldots,r\}} \sim c_r \rho_r^{-n} \quad\text{for fixed } r \text{ as } n \to \infty. \]

The quantity $\rho_r$ plays a role similar to that of the golden ratio when $r = 2$. END OF EXAMPLE I.4.


▷ I.13. Compositions into primes. The additive decomposition of integers into primes is still
surrounded with mystery. For instance, it is not known whether every even number is the sum
of two primes (Goldbach’s conjecture). However, the number of compositions of $n$ into prime
summands (any number of summands is permitted) is $B_n = [z^n] B(z)$ where

\[ \begin{aligned}
B(z) &= \Big( 1 - \sum_{p \text{ prime}} z^p \Big)^{-1} = \left( 1 - z^2 - z^3 - z^5 - z^7 - z^{11} - \cdots \right)^{-1} \\
     &= 1 + z^2 + z^3 + z^4 + 3z^5 + 2z^6 + 6z^7 + 6z^8 + 10z^9 + 16z^{10} + \cdots
\end{aligned} \]

(EIS A023360) and complex asymptotic methods make it easy from there to determine the
asymptotic form $B_n \sim 0.30365 \cdot 1.47622^n$; see Chapter IV. ◁
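
Example I.4 and Note I.13 both amount to expanding $1/(1 - T(z))$ for a set $\mathcal{T}$ of allowed summands. The following sketch, with illustrative function names of our own choosing, computes these coefficients by the linear recurrence implied by the denominator and compares them with the double sum (30) when $\mathcal{T} = \{1, \ldots, r\}$.

```python
from math import comb


def summand_compositions(n_max, allowed):
    """Coefficients [z^0..z^n_max] of 1 / (1 - sum_{t in allowed} z^t)."""
    allowed = sorted(set(allowed))
    C = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        # The first summand t leaves a composition of n - t.
        C[n] = sum(C[n - t] for t in allowed if t <= n)
    return C


def double_sum(n, r):
    """Formula (30) for n >= 1: sum over j, k of (-1)^k C(j, k) C(n - r*k - 1, j - 1)."""
    total = 0
    for j in range(1, n + 1):
        for k in range(j + 1):
            top = n - r * k - 1
            if top >= j - 1:  # otherwise the binomial coefficient vanishes
                total += (-1) ** k * comb(j, k) * comb(top, j - 1)
    return total


print(summand_compositions(10, [1, 2]))        # Fibonacci numbers F_{n+1}
print(summand_compositions(10, [2, 3, 5, 7]))  # compositions into primes, n <= 10
```

For prime summands only the primes not exceeding `n_max` matter, so a finite list suffices for a finite number of coefficients.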


EXAMPLE I.5. Partitions with restricted summands (denumerants). Whenever summands are
restricted to a finite set, the special partitions that result are called denumerants. A popular
denumerant problem consists in finding the number of ways of giving change of 99 cents using
coins that are pennies (1 ¢), nickels (5 ¢), dimes (10 ¢) and quarters (25 ¢). (The order in which
the coins are taken does not matter and repetitions are allowed.) For the case of a finite $\mathcal{T}$, we
predict from Proposition I.1 that $P^{\mathcal{T}}(z)$ is always a rational function with poles that are at roots
of unity; also the $P_n^{\mathcal{T}}$ satisfy a linear recurrence related to the structure of $\mathcal{T}$. The solution to
the original coin change problem is found to be

\[ [z^{99}] \frac{1}{(1 - z)(1 - z^5)(1 - z^{10})(1 - z^{25})} = 213. \]

In the same vein, one proves that

\[ P_n^{\{1,2\}} = \left\lceil \frac{2n + 3}{4} \right\rfloor, \qquad P_n^{\{1,2,3\}} = \left\lceil \frac{(n + 3)^2}{12} \right\rfloor. \]

There $\lceil x \rfloor \equiv \lfloor x + \frac{1}{2} \rfloor$ denotes the integer closest to the real number $x$. Such results are
typically obtained by a two-step process: (i) decompose the rational generating function into
simple fractions; (ii) compute the coefficients of each simple fraction and combine them to get
the final result [98, p. 108].

The general argument also gives the generating function of partitions whose summands lie
in the set $\{1, 2, \ldots, r\}$ as

(32)    \[ P^{\{1,\ldots,r\}}(z) = \prod_{m=1}^{r} \frac{1}{1 - z^m}. \]

In other words, we are enumerating partitions according to the value of the largest summand.
One then finds by looking at the poles (Chapter IV):

(33)    \[ P_n^{\{1,\ldots,r\}} \sim c_r n^{r-1} \quad\text{with}\quad c_r = \frac{1}{r! \, (r-1)!}. \]

A similar argument provides the asymptotic form of $P_n^{\mathcal{T}}$ when $\mathcal{T}$ is an arbitrary finite set:

\[ P_n^{\mathcal{T}} \sim \frac{1}{\tau} \frac{n^{r-1}}{(r-1)!} \quad\text{with}\quad \tau := \prod_{n \in \mathcal{T}} n, \quad r := \mathrm{card}(\mathcal{T}). \]

This result originally due to Schur is discussed in Chapter IV. .... END OF EXAMPLE I.5.
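
The denumerant counts of Example I.5 are easy to check by machine. The sketch below expands the product of Proposition I.1 one factor at a time; the helper names are illustrative, and the nearest-integer function is implemented in exact integer arithmetic to avoid floating-point rounding.

```python
def restricted_partitions(n_max, parts):
    """Coefficients [z^0..z^n_max] of prod_{t in parts} 1 / (1 - z^t)."""
    P = [1] + [0] * n_max
    for t in parts:
        # Multiplying a series by 1/(1 - z^t) is the in-place update P[n] += P[n - t].
        for n in range(t, n_max + 1):
            P[n] += P[n - t]
    return P


def nearest_integer(num, den):
    """floor(num/den + 1/2) for integers num >= 0 and den > 0."""
    return (2 * num + den) // (2 * den)


change = restricted_partitions(99, [1, 5, 10, 25])
print(change[99])  # ways of giving change of 99 cents
```

The same routine applied to `[1, 2]` and `[1, 2, 3]` reproduces the two closed forms $\lceil (2n+3)/4 \rfloor$ and $\lceil (n+3)^2/12 \rfloor$ stated above.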


We next examine compositions and partitions with a fixed number of summands. 


EXAMPLE I.6. Compositions with a fixed number of parts. Let $\mathcal{C}^{(k)}$ denote the class of
compositions made of $k$ summands, $k$ a fixed integer $\ge 1$. One has

\[ \mathcal{C}^{(k)} = \mathrm{SEQ}_k(\mathcal{I}) \equiv \mathcal{I} \times \mathcal{I} \times \cdots \times \mathcal{I}, \]

where the number of terms in the cartesian product is $k$. From there, the corresponding generating
function is found to be

\[ C^{(k)}(z) = (I(z))^k \quad\text{with}\quad I(z) = \frac{z}{1 - z}. \]

The number of compositions of $n$ having $k$ parts is thus

\[ C_n^{(k)} = [z^n] \frac{z^k}{(1 - z)^k} = \binom{n - 1}{k - 1}, \]

a result which constitutes a combinatorial refinement of $C_n = 2^{n-1}$. (Note that the formula
$C_n^{(k)} = \binom{n-1}{k-1}$ also results easily from the balls-and-bars model of compositions (Figure 6).)
In such a case, the asymptotic estimate $C_n^{(k)} \sim n^{k-1}/(k-1)!$ results immediately from the
polynomial form of the binomial coefficient $\binom{n-1}{k-1}$. ... END OF EXAMPLE I.6.


EXAMPLE I.7. Partitions with a fixed number of parts. Let $\mathcal{P}^{(\le k)}$ be the class of integer
partitions with at most $k$ summands. With our notation for restricted constructions (p. 29), this
class is specified as

\[ \mathcal{P}^{(\le k)} = \mathrm{MSET}_{\le k}(\mathcal{I}). \]

It would be possible to appeal to the admissibility of such restricted compositions as developed
in Section I.6.1, but the following direct argument suffices.

Geometrically, partitions are represented as collections of points: this is the staircase
model of Figure 6. A symmetry around the main diagonal (also known in the specialized
literature as conjugation) exchanges number of summands and value of largest summand: one
has (with previous notations)

\[ \mathcal{P}^{(\le k)} \cong \mathcal{P}^{\{1,\ldots,k\}} \quad\Longrightarrow\quad P^{(\le k)}(z) = P^{\{1,\ldots,k\}}(z), \]

so that, by (32),

(34)    \[ P^{(\le k)}(z) = P^{\{1,\ldots,k\}}(z) = \prod_{m=1}^{k} \frac{1}{1 - z^m}. \]

As a consequence, the OGF of partitions with exactly $k$ summands, $P^{(k)}(z) = P^{(\le k)}(z) -
P^{(\le k-1)}(z)$, evaluates to

\[ P^{(k)}(z) = \frac{z^k}{(1 - z)(1 - z^2) \cdots (1 - z^k)}. \]

Given the equivalence between number of parts and largest part in partitions, the asymptotic
estimate (33) applies verbatim here. ... END OF EXAMPLE I.7.


▷ I.14. Compositions with summands bounded in number and size. The number of compositions
of size $n$ with $k$ summands each at most $r$ is

\[ [z^n] \left( z \frac{1 - z^r}{1 - z} \right)^k, \]

and is expressible as a simple binomial convolution. ◁


▷ I.15. Partitions with summands bounded in number and size. The number of partitions of
size $n$ with at most $k$ summands each at most $\ell$ is

\[ [z^n] \frac{(1 - z)(1 - z^2) \cdots (1 - z^{k+\ell})}{\big( (1 - z)(1 - z^2) \cdots (1 - z^k) \big) \cdot \big( (1 - z)(1 - z^2) \cdots (1 - z^\ell) \big)}. \]

(The verification by recurrence is easy.) The GF reduces to the binomial coefficient $\binom{k+\ell}{k}$ as
$z \to 1$; it is known as a Gaussian binomial coefficient, denoted $\binom{k+\ell}{k}_z$, or a “q-analogue” of
the binomial coefficient [10, 98]. ◁

The last example of this section illustrates the close interplay between combinatorial
decompositions and special function identities, which constitutes a recurrent
theme of classical combinatorial analysis.


EXAMPLE I.8. The Durfee square of partitions and stack polyominoes. The diagram of any 
partition contains a uniquely determined square (known as the Durfee square) that is maximal, 
as exemplified by the following diagram: 


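This decomposition is expressed in terms of partition GFs as

\[ \mathcal{P} \cong \bigcup_{k \ge 0} \left( \mathcal{Z}^{k^2} \times \mathcal{P}^{(\le k)} \times \mathcal{P}^{\{1,\ldots,k\}} \right). \]

It gives, via (32) and (34), the non-trivial identity

\[ \prod_{n=1}^{\infty} \frac{1}{1 - z^n} = \sum_{k \ge 0} \frac{z^{k^2}}{\left( (1 - z) \cdots (1 - z^k) \right)^2} \]

($k$ is the size of the Durfee square), which is nothing but a formal rewriting of the geometric
decomposition.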
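
The Durfee-square identity can be verified numerically on truncated power series. The sketch below is one way to do so, with coefficient lists standing for series modulo $z^{N+1}$; all function names are illustrative.

```python
def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated after z^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c


def parts_at_most(k, N):
    """Series of prod_{m=1..k} 1/(1 - z^m), truncated after z^N (equation (32))."""
    P = [1] + [0] * N
    for m in range(1, k + 1):
        for n in range(m, N + 1):
            P[n] += P[n - m]
    return P


def durfee_sum(N):
    """Series of sum_{k>=0} z^(k^2) / ((1-z)...(1-z^k))^2, truncated after z^N."""
    total = [0] * (N + 1)
    k = 0
    while k * k <= N:
        side = parts_at_most(k, N)
        both = mul(side, side, N)  # pieces to the right of and below the square
        for n in range(k * k, N + 1):
            total[n] += both[n - k * k]
        k += 1
    return total


print(durfee_sum(10))
```

Since partitions of $n \le N$ only use parts of size at most $N$, the left-hand side of the identity is `parts_at_most(N, N)`, and the two lists should agree coefficient by coefficient.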

Here is a similar case illustrating the direct correspondence between geometric diagrams
and generating functions, as afforded by the symbolic method.

Class                         Spec.                                OGF                              coeff.                asympt.
Compositions                  SEQ(SEQ_{\ge 1}(Z))                  (1 - z)/(1 - 2z)                 2^{n-1}               (1/2) 2^n
Compositions, summands <= r   SEQ(SEQ_{1..r}(Z))                   (1 - z)/(1 - 2z + z^{r+1})       Eq. (30)              c_r \rho_r^{-n}
Compositions, k summands      SEQ_k(SEQ_{\ge 1}(Z))                z^k/(1 - z)^k                    \binom{n-1}{k-1}      n^{k-1}/(k-1)!
Partitions                    MSET(SEQ_{\ge 1}(Z))                 \prod_{m \ge 1} (1 - z^m)^{-1}   -                     (1/(4n\sqrt{3})) e^{\pi\sqrt{2n/3}}
Partitions, summands <= r     MSET(SEQ_{1..r}(Z))                  \prod_{m=1}^{r} (1 - z^m)^{-1}   -                     n^{r-1}/(r!(r-1)!)
Partitions, <= k summands     \cong MSET(SEQ_{1..k}(Z))            \prod_{m=1}^{k} (1 - z^m)^{-1}   -                     n^{k-1}/(k!(k-1)!)
Cyclic compositions           CYC(SEQ_{\ge 1}(Z))                  Eq. (35)                         Eq. (36)              2^n/n
Partitions, distinct summands PSET(SEQ_{\ge 1}(Z))                 \prod_{m \ge 1} (1 + z^m)        -                     (3^{3/4}/(12 n^{3/4})) e^{\pi\sqrt{n/3}}

FIGURE I.8. Partitions and compositions: specifications, generating functions, counting
sequences, and asymptotic approximation.


Stack polyominoes are diagrams of compositions such that for some $j$, $\ell$, one has
$1 \le x_1 \le x_2 \le \cdots \le x_j \ge x_{j+1} \ge \cdots \ge x_\ell \ge 1$ (see [447, §2.5] for further properties). The diagram
representation of stack polyominoes,


translates immediately into the OGF

\[ S(z) = \sum_{k \ge 1} \frac{z^k}{1 - z^k} \cdot \frac{1}{\left( (1 - z)(1 - z^2) \cdots (1 - z^{k-1}) \right)^2}, \]

once use is made of the partition GFs $P^{\{1,\ldots,k\}}(z)$ of (32). This last relation provides a bona fide
algorithm for computing the initial values of the number of stack polyominoes (EIS A001523):

\[ S(z) = z + 2z^2 + 4z^3 + 8z^4 + 15z^5 + 27z^6 + 47z^7 + 79z^8 + \cdots. \]

The book of van Rensburg [482] describes many such constructions and their relation to certain
models of statistical physics. ... END OF EXAMPLE I.8.
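
The “bona fide algorithm” mentioned in Example I.8 can be sketched as follows, under the assumption that the OGF is $S(z) = \sum_{k \ge 1} \frac{z^k}{1 - z^k} \big( P^{\{1,\ldots,k-1\}}(z) \big)^2$, where $k$ is the height of the tallest column. The function name is an illustrative choice.

```python
def stack_polyominoes(N):
    """Coefficients [z^0..z^N] of sum_{k>=1} z^k/(1 - z^k) * (prod_{m<k} 1/(1 - z^m))^2."""
    total = [0] * (N + 1)
    for k in range(1, N + 1):
        # Series of prod_{m=1..k-1} 1/(1 - z^m): partitions with parts smaller than k.
        below = [1] + [0] * N
        for m in range(1, k):
            for n in range(m, N + 1):
                below[n] += below[n - m]
        # Square it: one partition on each side of the block of tallest columns.
        squared = [sum(below[i] * below[n - i] for i in range(n + 1)) for n in range(N + 1)]
        # Multiply by z^k/(1 - z^k) = z^k + z^(2k) + ...: the tallest columns themselves.
        for shift in range(k, N + 1, k):
            for n in range(shift, N + 1):
                total[n] += squared[n - shift]
    return total


print(stack_polyominoes(8))
```

The output can be compared with the expansion of $S(z)$ displayed in the example.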


Figure 8 summarizes what has been learnt regarding compositions and partitions.
The way several combinatorial problems are solved effortlessly by the symbolic
method is worth noting.

I.3.2. Related constructions. It is also natural to consider the two constructions
of cycle and powerset that we have not yet applied to $\mathcal{I}$.


Cyclic compositions (wheels). The class $\mathcal{D} = \mathrm{CYC}(\mathcal{I})$ comprises compositions
defined up to circular shift of the summands; so, for instance $2 + 3 + 1 + 2 + 5$,
$3 + 1 + 2 + 5 + 2$, etc., are identified. Alternatively, we may view elements of $\mathcal{D}$
as “wheels” composed of circular arrangements of rows of balls (taken up to circular
symmetry).

[Diagram: a “wheel” (cyclic composition).]

By the cycle construction, the OGF is (with $\varphi$ the Euler totient function)

(35)    \[ D(z) = \sum_{k \ge 1} \frac{\varphi(k)}{k} \log \left( 1 - \frac{z^k}{1 - z^k} \right)^{-1} = z + 2z^2 + 3z^3 + 5z^4 + 7z^5 + 13z^6 + 19z^7 + 35z^8 + \cdots. \]

The coefficients are thus (EIS A008965)

(36)    \[ D_n = \frac{1}{n} \sum_{k \mid n} \varphi(k) (2^{n/k} - 1) \equiv -1 + \frac{1}{n} \sum_{k \mid n} \varphi(k) 2^{n/k}. \]

Notice that $D_n$ is of the same asymptotic order as $\frac{1}{n} C_n$, which is suggested by circular
symmetry of wheels, but $D_n \sim 2 C_n / n$.
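
Formula (36) can be cross-checked against a direct enumeration of wheels for small $n$. The sketch below uses illustrative helper names and picks the lexicographically smallest rotation as a canonical representative; this is one convenient convention, not the only one.

```python
from math import gcd


def euler_phi(k):
    """Euler totient: number of integers in 1..k coprime to k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def wheels(n):
    """D_n = (1/n) * sum_{k | n} phi(k) * (2^(n/k) - 1), for n >= 1."""
    total = sum(euler_phi(k) * (2 ** (n // k) - 1)
                for k in range(1, n + 1) if n % k == 0)
    return total // n


def wheels_brute_force(n):
    """Count compositions of n up to circular shift by explicit enumeration."""
    def compositions(remaining, prefix):
        if remaining == 0:
            yield tuple(prefix)
            return
        for s in range(1, remaining + 1):
            yield from compositions(remaining - s, prefix + [s])

    canonical = set()
    for c in compositions(n, []):
        canonical.add(min(c[i:] + c[:i] for i in range(len(c))))
    return len(canonical)


print([wheels(n) for n in range(1, 9)])
```

The brute-force routine is exponential in $n$ and is only meant as a check for small sizes.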

Partitions into distinct summands. The class $\mathcal{Q} = \mathrm{PSET}(\mathcal{I})$ is the subclass
of $\mathcal{P} = \mathrm{MSET}(\mathcal{I})$ corresponding to partitions determined as in Definition I.9, but
with the strict inequalities $x_1 > x_2 > \cdots > x_k$, so that the OGF is

\[ Q(z) = \prod_{n \ge 1} (1 + z^n) = 1 + z + z^2 + 2z^3 + 2z^4 + 3z^5 + 4z^6 + 5z^7 + 6z^8 + \cdots. \]

The coefficients (EIS A000009) are not amenable to closed form. However the saddle
point method (Chapter VIII) yields the approximation:

(37)    \[ Q_n \sim \frac{3^{3/4}}{12 \, n^{3/4}} \exp\left( \pi \sqrt{\frac{n}{3}} \right), \]

which has a shape similar to that of $P_n$.


▷ I.16. Odd versus distinct summands. The partitions of $n$ into odd summands ($O_n$) and into
distinct summands ($Q_n$) are equinumerous. Indeed, one has

\[ Q(z) = \prod_{m=1}^{\infty} (1 + z^m), \qquad O(z) = \prod_{j=0}^{\infty} (1 - z^{2j+1})^{-1}. \]

Equality results from substituting $(1 + a) = (1 - a^2)/(1 - a)$ with $a = z^m$,

\[ Q(z) = \frac{1 - z^2}{1 - z} \, \frac{1 - z^4}{1 - z^2} \, \frac{1 - z^6}{1 - z^3} \, \frac{1 - z^8}{1 - z^4} \, \frac{1 - z^{10}}{1 - z^5} \cdots = \frac{1}{1 - z} \, \frac{1}{1 - z^3} \, \frac{1}{1 - z^5} \cdots, \]

and simplification of the numerators with half of the denominators (those with even exponents). ◁


Partitions into powers. Let $\mathcal{I}^{\mathrm{pow}} = \{1, 2, 4, 8, \ldots\}$ be the set of powers of 2.
The corresponding $\mathcal{P}$ and $\mathcal{Q}$ partitions have OGFs

\[ \begin{aligned}
P^{\mathrm{pow}}(z) &= \prod_{j=0}^{\infty} \frac{1}{1 - z^{2^j}} = 1 + z + 2z^2 + 2z^3 + 4z^4 + 4z^5 + 6z^6 + 6z^7 + 10z^8 + \cdots, \\
Q^{\mathrm{pow}}(z) &= \prod_{j=0}^{\infty} (1 + z^{2^j}) = 1 + z + z^2 + z^3 + z^4 + z^5 + \cdots.
\end{aligned} \]

The first sequence 1, 1, 2, 2, ... is the “binary partition sequence” (EIS A018819);
the difficult asymptotic analysis was performed by de Bruijn [110] who obtained an
estimate that involves subtle fluctuations and is of the global form $e^{O(\log^2 n)}$. The
function $Q^{\mathrm{pow}}(z)$ reduces to $(1 - z)^{-1}$ since every number has a unique additive
decomposition into powers of 2. Accordingly, the identity

\[ \frac{1}{1 - z} = \prod_{j=0}^{\infty} (1 + z^{2^j}), \]

first observed by Euler is sometimes nicknamed the “computer scientist’s identity” as
it expresses the fact that every number admits a unique binary representation.
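
Both product formulae for powers of 2 are easily expanded by machine. The sketch below (function names are illustrative) multiplies in one factor per power of 2, updating upward for the multiset product and downward for the powerset product so that each part is used at most once.

```python
def binary_partitions(N):
    """Coefficients [z^0..z^N] of prod_{j>=0} 1 / (1 - z^(2^j))."""
    P = [1] + [0] * N
    part = 1
    while part <= N:
        for n in range(part, N + 1):      # upward: the part may be repeated
            P[n] += P[n - part]
        part *= 2
    return P


def distinct_binary_partitions(N):
    """Coefficients [z^0..z^N] of prod_{j>=0} (1 + z^(2^j))."""
    Q = [1] + [0] * N
    part = 1
    while part <= N:
        for n in range(N, part - 1, -1):  # downward: the part is used at most once
            Q[n] += Q[n - part]
        part *= 2
    return Q


print(binary_partitions(8))
print(distinct_binary_partitions(8))
```

The second list consists of ones only, which is the “computer scientist’s identity” in coefficient form.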

There exists a rich set of identities satisfied by partition generating functions;
this fact owes to deep connections with elliptic functions, modular forms, and
q-analogues of special functions on the one hand, basic combinatorics and number
theory on the other hand. See [10, 98] for an introduction to this fascinating subject.

▷ I.17. Euler’s pentagonal number theorem. This famous identity expresses $1/P(z)$ as

\[ \prod_{n \ge 1} (1 - z^n) = \sum_{k \in \mathbb{Z}} (-1)^k z^{k(3k+1)/2}. \]

It is proved formally and combinatorially in [98, p. 105]. As a consequence, the numbers
$\{P_j\}_{j=0}^{N}$ can be determined in $O(N\sqrt{N})$ arithmetic operations. ◁
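
To make the complexity claim of Note I.17 concrete: multiplying $P(z)$ by the pentagonal series and equating coefficients gives, for $n \ge 1$, $P_n = \sum_{k \ge 1} (-1)^{k+1} \big( P_{n - k(3k-1)/2} + P_{n - k(3k+1)/2} \big)$, with the convention $P_m = 0$ for $m < 0$. A minimal sketch (the function name is illustrative):

```python
def partitions_pentagonal(N):
    """Return [P_0, ..., P_N] using Euler's pentagonal number theorem."""
    P = [1] + [0] * N
    for n in range(1, N + 1):
        total = 0
        k = 1
        # Only O(sqrt(n)) generalized pentagonal numbers are <= n.
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1
            total += sign * P[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * P[n - k * (3 * k + 1) // 2]
            k += 1
        P[n] = total
    return P


print(partitions_pentagonal(10))
```

Each $P_n$ needs about $\sqrt{n}$ additions, which is where the $O(N\sqrt{N})$ bound comes from.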
> 1.18. A digital surprise. Define the constant 

\[ \varphi := \frac{9}{10} \cdot \frac{99}{100} \cdot \frac{999}{1000} \cdot \frac{9999}{10000} \cdots. \]

Is it a surprise that it evaluates numerically to

\[ \varphi \doteq 0.8900100999989990000001000099999999899999000000000010 \cdots, \]

that is, its decimal representation involves only the digits 0, 1, 8, 9? [This is suggested by a note
of S. Ramanujan, “Some definite integrals”, Messenger of Math. XLIV, 1915, pp. 10-18.] ◁


▷ I.19. Lattice points. The number of lattice points with integer coordinates that belong to the
closed ball of radius $n$ in $d$-dimensional Euclidean space is

\[ [z^{n^2}] \frac{1}{1 - z} \left( \Theta(z) \right)^d \quad\text{where}\quad \Theta(z) = 1 + 2 \sum_{n=1}^{\infty} z^{n^2}. \]

Such OGFs are useful in cryptography [321]. Estimates may be obtained from the saddle point
method; see Chapter VIII. ◁




I.4. Words and regular languages 


Fix a finite alphabet $\mathcal{A}$ whose elements are called letters. Each letter is taken
to have size 1, i.e., it is an atom. A word is then any finite sequence of letters,
usually written without separators. So, for us, with the choice of the Latin alphabet
($\mathcal{A} = \{a, \ldots, z\}$), sequences written as ygololihp, philology, zgrmblglps
are words. We denote the set of all words (often written as $\mathcal{A}^*$ in formal linguistics)
by $\mathcal{W}$. Following a well-established tradition in theoretical computer science and formal
linguistics, any subset of $\mathcal{W}$ is called a language (or formal language, when the
distinction with natural languages has to be made).

From the definition of the set of words $\mathcal{W}$, one has

(38)    \[ \mathcal{W} \cong \mathrm{SEQ}(\mathcal{A}) \quad\Longrightarrow\quad W(z) = \frac{1}{1 - mz}, \]

where $m$ is the cardinality of the alphabet, i.e., the number of letters. The generating
function gives us the counting result

\[ W_n = m^n. \]


This result is elementary, but, as is usual with symbolic methods, many enumerative 
consequences result from a given construction. It is precisely the purpose of this 
section to examine some of them. 


We shall introduce separately two frameworks that each have great expressive 
power to describe languages. The first one is iterative (i.e., nonrecursive) and it bases 
itself on “regular specifications” that only involve sums, products, and sequences; the 
other one that is recursive (but of a very simple form) is best conceived of in terms 
of finite automata and is equivalent to linear systems of equations. Both frameworks 
turn out to be logically equivalent in the sense that they determine the same family 
of languages, the regular languages, though the equivalence⁵ is nontrivial and each
particular problem usually admits a preferred representation. The resulting OGFs are 
invariably rational functions, a fact to be systematically exploited from an asymptotic 
standpoint in Chapters IV and V. 


I.4.1. Regular specifications. Consider words (or strings) over the binary alphabet
$\mathcal{A} = \{a, b\}$. There is an alternative way to construct binary strings. It is
based on the observation that (with a minor adjustment at the beginning) a string decomposes
into a succession of “blocks” each formed with a single $b$ followed by an
arbitrary (possibly empty) sequence of $a$’s. For instance aaabaababaabbabbaaa decomposes as

    aaa | baa | ba | baa | b | ba | b | baaa.

Omitting redundant⁶ symbols, we have the alternative decomposition:

(39)    \[ \mathcal{W} \cong \mathrm{SEQ}(a) \times \mathrm{SEQ}(b \, \mathrm{SEQ}(a)) \quad\Longrightarrow\quad W(z) = \frac{1}{1 - z} \cdot \frac{1}{1 - z \frac{1}{1 - z}}. \]

This last expression reduces to $(1 - 2z)^{-1}$ as it should.

⁵ APPENDIX A: Regular languages, p. 678 provides a basis for this equivalence.
⁶ As usual, when dealing with words, we freely omit redundant braces ‘{, }’ and cartesian products
‘×’. For instance, SEQ(a + b) and ab are shorthand notations for SEQ({a} + {b}) and {a} × {b}.

Longest runs. The interest of the construction just seen is to take into account
various meaningful properties, for example longest runs. Denote by $a^{<k} := \mathrm{SEQ}_{<k}(a)$
the collection of all words formed with the letter $a$ only and whose length is between
0 and $k - 1$; the corresponding OGF is $1 + z + \cdots + z^{k-1} = (1 - z^k)/(1 - z)$.
The collection $\mathcal{W}^{\langle k \rangle}$ of words which do not have $k$ consecutive $a$’s is described by an
amended form of (39), and

\[ \mathcal{W}^{\langle k \rangle} = a^{<k} \, \mathrm{SEQ}(b \, a^{<k}) \quad\Longrightarrow\quad W^{\langle k \rangle}(z) = \frac{1 - z^k}{1 - z} \cdot \frac{1}{1 - z \frac{1 - z^k}{1 - z}} = \frac{1 - z^k}{1 - 2z + z^{k+1}}. \]

The OGF is in principle amenable to expansion, but the resulting coefficient expressions
are complicated and, in such a case, asymptotic estimates tend to be more usable.
From an analysis developed in Chapter V, it can indeed be deduced that the longest
run of $a$’s in a random binary string of length $n$ is asymptotic to $\log_2 n$.
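
Although the closed-form coefficients are unwieldy, the rational OGF $(1 - z^k)/(1 - 2z + z^{k+1})$ yields an immediate linear recurrence, $W_n = 2 W_{n-1} - W_{n-k-1} + [n = 0] - [n = k]$. The sketch below (illustrative names) implements it and compares the result with a brute-force count over all binary words.

```python
from itertools import product


def no_long_a_runs(n_max, k):
    """Coefficients [z^0..z^n_max] of (1 - z^k) / (1 - 2z + z^(k+1))."""
    W = []
    for n in range(n_max + 1):
        value = (1 if n == 0 else 0) - (1 if n == k else 0)  # numerator 1 - z^k
        if n >= 1:
            value += 2 * W[n - 1]
        if n >= k + 1:
            value -= W[n - k - 1]
        W.append(value)
    return W


def brute_force(n, k):
    """Number of words of length n over {a, b} without k consecutive a's."""
    forbidden = "a" * k
    return sum(1 for w in product("ab", repeat=n) if forbidden not in "".join(w))


print(no_long_a_runs(10, 2))  # words avoiding "aa"
```

For $k = 2$ the recurrence reproduces the Fibonacci-type counts of words with no two consecutive $a$’s.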

> 1.20. Runs in arbitrary alphabets. For an alphabet of cardinality m, the quantity 

\[ \frac{1 - z^k}{1 - mz + (m - 1) z^{k+1}} \]

is the OGF of words without $k$ consecutive occurrences of a designated letter. ◁


The case of longest runs exemplifies the usefulness of nested constructions in- 
volving sequences. We set: 


Definition I.10. An iterative specification that only involves atoms (e.g., letters of a
finite alphabet $\mathcal{A}$) together with combinatorial sums, cartesian products, and sequence
constructions is said to be a regular specification.

A language $\mathcal{L}$ is said to be S-regular (specification-regular) if there exists a class
$\mathcal{M}$ described by a regular specification $\mathcal{R}$ such that $\mathcal{L}$ and $\mathcal{M}$ are combinatorially
isomorphic: $\mathcal{L} \cong \mathcal{M}$.

An equivalent way of expressing the definition is as follows: a language is S-regular
if it can be described unambiguously by a regular expression (APPENDIX A:
Regular languages, p. 678). The definition of a regular specification and the basic
admissibility theorem imply immediately:

Proposition I.2. Any S-regular language has an OGF that is a rational function. This
OGF is obtained from a regular specification of the language by translating each letter
into the variable $z$, disjoint unions into sums, cartesian products into products, and
sequences into quasi-inverses, $(1 - \cdot)^{-1}$.


This result is technically shallow but its importance derives from the fact that 
regular languages have great expressive power devolving from their rich closure prop- 
erties (APPENDIX A: Regular languages, p. 678) as well as their relation to finite 
automata discussed in the next subsection. Examples 9 and 10 make use of Proposi- 
tion I.2 and treat two problems closely related to longest runs. 


EXAMPLE I.9. Combinations and spacings. A regular specification describes the set $\mathcal{L}$ of
words that contain exactly $k$ occurrences of the letter $b$, from which the OGF automatically
derives:

(40)    \[ \mathcal{L} = \mathrm{SEQ}(a) \, (b \, \mathrm{SEQ}(a))^k \quad\Longrightarrow\quad L(z) = \frac{z^k}{(1 - z)^{k+1}}. \]

Hence the number of words in the language satisfies $L_n = \binom{n}{k}$. This is otherwise combinatorially
evident, since each word of length $n$ is characterized by the positions of its letters $b$, that
is, the choice of $k$ positions amongst $n$ possible ones. Symbolic methods thus give us back the
well-known count of combinations by binomial coefficients.

Let $C_{n,k}$ be the number of combinations of $k$ elements amongst $[1, n]$ with constrained
spacings: no element can be at distance $d$ or more from its successor. The refinement of (40)

\[ \mathcal{L}^{[d]} = \mathrm{SEQ}(a) \, (b \, \mathrm{SEQ}_{<d}(a))^{k-1} \, (b \, \mathrm{SEQ}(a)) \quad\Longrightarrow\quad L^{[d]}(z) = \frac{z^k (1 - z^d)^{k-1}}{(1 - z)^{k+1}} \]

leads to a binomial convolution expression,

\[ C_{n,k} = \sum_{j} (-1)^j \binom{k - 1}{j} \binom{n - dj}{k}. \]

(This problem is analogous to compositions with bounded summands.) What we have just
analysed is the largest spacing (constrained to be $< d$) in subsets; a parallel analysis yields
information regarding the smallest spacing. ... END OF EXAMPLE I.9.


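EXAMPLE I.10. Double run statistics. By forming maximal groups of equal letters in words,
one finds easily that, for a binary alphabet,

\[ \mathcal{W} \cong \mathrm{SEQ}(b) \, \mathrm{SEQ}(a \, \mathrm{SEQ}(a) \, b \, \mathrm{SEQ}(b)) \, \mathrm{SEQ}(a). \]

Let $\mathcal{W}^{\langle \alpha, \beta \rangle}$ be the class of all words that have at most $\alpha$ consecutive $a$’s and at most $\beta$
consecutive $b$’s. The specification of $\mathcal{W}$ produces a specification of $\mathcal{W}^{\langle \alpha, \beta \rangle}$, upon replacing
$\mathrm{SEQ}(a)$, $\mathrm{SEQ}(b)$ by $\mathrm{SEQ}_{<\alpha}(a)$, $\mathrm{SEQ}_{<\beta}(b)$ internally, and by $\mathrm{SEQ}_{\le\alpha}(a)$, $\mathrm{SEQ}_{\le\beta}(b)$ externally.
In particular, the OGF of binary words that never have more than $r$ consecutive equal letters is
found to be (set $\alpha = \beta = r$)

(41)    \[ W^{\langle r, r \rangle}(z) = \frac{1 - z^{r+1}}{1 - 2z + z^{r+1}} = \frac{1 + z + \cdots + z^r}{1 - z - \cdots - z^r}, \]

after simplification.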

Révész in [410] tells the following amusing story attributed to T. Varga: “A class of high 
school children is divided into two sections. In one of the sections, each child is given a coin 
which he throws two hundred times, recording the resulting head and tail sequence on a piece 
of paper. In the other section, the children do not receive coins, but are told instead that they 
should try to write down a ‘random’ head and tail sequence of length two hundred. Collecting 
these slips of paper, [a statistician] then tries to subdivide them into their original groups. Most 
of the time, he succeeds quite well.” 

The statistician’s secret is to determine the probability distribution of the maximum length
of runs of consecutive letters in a random binary word of length $n$ (here $n = 200$). The
probability of this parameter to equal $k$ is

\[ \frac{1}{2^n} \left( W_n^{\langle k, k \rangle} - W_n^{\langle k-1, k-1 \rangle} \right) \]

and is fully determined by (41). The probabilities are then easily computed using any symbolic
package: For $n = 200$, the values found are

k    3           4           5       6       7       8       9       10      11      12
P    6.54·10^-8  7.07·10^-4  0.0339  0.1660  0.2574  0.2235  0.1459  0.0829  0.0440  0.0226

Thus, in a randomly produced sequence of length 200, there are usually runs of length 7 or
more: the probability of the event turns out to be close to 80% (and there is still a probability
of about 8% to have a run of length 11 or more). On the other hand most children (and adults)
are usually afraid of writing down runs longer than 4 or 5 as this is felt as strongly “non-random”.
The statistician simply selects the slips that contain runs of length 6 or more. Et
voilà! ... END OF EXAMPLE I.10.
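
The probabilities of Example I.10 do not require a symbolic package: the second form of (41) gives the recurrence $W_n = W_{n-1} + \cdots + W_{n-r} + [n \le r]$. The sketch below uses exact rational arithmetic; the function names are illustrative, and the brute-force routine is only intended as a check for small $n$.

```python
from fractions import Fraction
from itertools import groupby, product


def words_with_runs_at_most(n, r):
    """[z^n] (1 + z + ... + z^r) / (1 - z - ... - z^r): binary words with all runs <= r."""
    if r == 0:
        return 1 if n == 0 else 0
    W = []
    for m in range(n + 1):
        value = 1 if m <= r else 0  # contribution of the numerator
        value += sum(W[m - s] for s in range(1, min(r, m) + 1))
        W.append(value)
    return W[n]


def longest_run_distribution(n):
    """Map k -> P(longest run = k) for a uniform random binary word of length n >= 1."""
    counts = [words_with_runs_at_most(n, r) for r in range(n + 1)]
    return {k: Fraction(counts[k] - counts[k - 1], 2 ** n) for k in range(1, n + 1)}


def longest_run_brute_force(n):
    """Map k -> number of binary words of length n whose longest run has length k."""
    tally = {}
    for w in product("01", repeat=n):
        longest = max(len(list(group)) for _, group in groupby(w))
        tally[longest] = tally.get(longest, 0) + 1
    return tally


dist = longest_run_distribution(200)
for k in range(3, 13):
    print(k, float(dist[k]))
```

The printed values can be compared with the table above; summing `dist[k]` over $k \ge 7$ gives the probability of a run of length 7 or more.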


> 1.21. Alice and Bob. Alice wants to communicate n bits of information to Bob over a channel 
(a wire, an optic fiber) that transmits 0,1-bits but is such that any occurrence of 11 terminates
the transmission. Thus, she can only send on the channel an encoded version of her message
(where the code is of some length $\ell \ge n$) that does not contain the pattern 11.

Here is a first coding scheme: given the message $m = m_1 m_2 \cdots m_n$, where $m_j \in \{0, 1\}$,
apply the substitution: $0 \mapsto 00$ and $1 \mapsto 10$; terminate the transmission by sending 11. This
scheme has $\ell = 2n + O(1)$, and we say its rate is 2. Can one design codes with rate arbitrarily
close to 1, asymptotically?

Let $\mathcal{C}$ be the class of allowed code words. A code of length at most $L$ is achievable only
if there is a one-to-one mapping from $\{0, 1\}^n$ into $\bigcup_{j=0}^{L} \mathcal{C}_j$, i.e., $2^n \le \sum_{j=0}^{L} C_j$. Working out
the OGF of $\mathcal{C}$, one finds that necessarily

\[ L \ge \lambda n + O(1), \qquad \lambda = \frac{1}{\log_2 \varphi} \doteq 1.44042, \qquad \varphi = \frac{1 + \sqrt{5}}{2}. \]

Thus no code can achieve a rate better than 1.44; i.e., a loss of at least 44% is unavoidable. (For
this and the next note, see, e.g., MacKay [349, Ch. 17].) ◁


▷ I.22. Coding without long runs. Because of hysteresis in magnetic heads, certain storage
devices cannot store binary sequences that have more than 4 consecutive 0’s or more than
4 consecutive 1’s. A coding scheme that transforms an arbitrary binary string into a string
obeying this constraint is sought.

From the OGF, one finds $[z^{11}] W^{\langle 4, 4 \rangle}(z) = 1546 > 2^{10} = 1024$. Consequently, a
substitution can be built that translates an original 10 bit block into an 11 bit block without
five consecutive equal letters. When substituted blocks are concatenated, this may give rise to
unwanted sequences of consecutive letters that are longer than acceptable. It then suffices to use
“separators” and replace a substituted block of the form $\alpha \cdots \beta$ by the longer block $\bar{\alpha} \alpha \cdots \beta \bar{\beta}$,
where $\bar{0} = 1$ and $\bar{1} = 0$. The resulting code has rate $\frac{13}{10}$.

Extensions of this method show that the rate 1.057 is achievable (theoretically). On the
other hand, by the previous note, any acceptable code must use asymptotically at least $1.056\,n$
bits to encode strings of $n$ bits. (Hint: let $\alpha$ be the root near $\frac{1}{2}$ of $1 - 2\alpha + \alpha^5 = 0$, which is a
pole of $W^{\langle 4, 4 \rangle}$. One has $1/\log_2(1/\alpha) \doteq 1.05621$.) ◁
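
The two numerical facts used in Note I.22 can be reproduced with a few lines of code. The sketch below recomputes $[z^{11}] W^{\langle 4,4 \rangle}(z)$ from (41) and locates the pole $\alpha$ by bisection; the bracketing interval $[\frac{1}{2}, \frac{3}{4}]$ is our own assumption, chosen because $1 - 2x + x^{r+1}$ changes sign exactly once there for $r \ge 2$.

```python
from math import log2


def words_with_runs_at_most(n, r):
    """[z^n] (1 + z + ... + z^r) / (1 - z - ... - z^r), assuming r >= 1."""
    W = []
    for m in range(n + 1):
        value = 1 if m <= r else 0
        value += sum(W[m - s] for s in range(1, min(r, m) + 1))
        W.append(value)
    return W[n]


def dominant_pole(r, iterations=200):
    """Root of 1 - 2x + x^(r+1) in (1/2, 3/4), found by bisection (assumes r >= 2)."""
    def f(x):
        return 1 - 2 * x + x ** (r + 1)

    lo, hi = 0.5, 0.75  # f(lo) > 0 and f(hi) < 0 for every r >= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(words_with_runs_at_most(11, 4))    # number of admissible 11-bit blocks
print(1 / log2(1 / dominant_pole(4)))    # asymptotic lower bound on the rate
```

The first value exceeds $2^{10}$, which is what makes the 10-bit to 11-bit substitution possible.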


Patterns. There are many situations in the sciences where it is of interest to de- 
termine whether the appearance of a certain pattern in long sequences of observations 
is significant. In a genomic sequence of length 100,000 (the alphabet is A, G, C, T), 
is it or not meaningful to detect three occurrences of the pattern TAGATAA, where 
the letters appear consecutively and in the prescribed order? In computer network 
security, certain attacks can be detected by some well defined alarming sequences of 
events, though these events may be separated by perfectly legitimate actions. On an- 
other register, data mining aims at broadly categorizing electronic documents in an 
automatic way, and in this context the observation of well chosen patterns can provide 
highly discriminating criteria. These various applications require determining which 
patterns are, with high probability, bound to occur (these are not significant) and which 
are very unlikely to arise, so that actually observing them carries useful information.
Quantifying the corresponding probabilistic phenomena reduces to an enumerative
problem; the case of double runs in Example 10 is in this respect typical.

The notion of pattern can be formalized in several ways. In this book, we shall 
consider two of them: 


(a) Subsequence pattern: such a pattern is defined by the fact that its letters must
appear in the right order, but not necessarily contiguously [215]. Subsequence
patterns are also known as “hidden patterns”.

(b) Factor pattern: such a pattern is defined by the fact that its letters must appear
in the right order and contiguously [254, 458]. Factor patterns are also called
“block patterns” or simply “patterns” when the context is clear.


For a given notion of pattern, there are then two related categories of problems. First, 
one may aim at determining the probability that a random word contains (or dually, 
excludes) a pattern; this problem is equivalently formulated as an existence problem— 
enumerate all words in which the pattern exists (i.e., occurs) independently of the 
number of occurrences. Second, one may aim at determining the expectation (or even 
the distribution) of the number of occurrences of a pattern in a random text; this prob- 
lem involves enumerating enriched words, each with one occurrence of the pattern 
distinguished. 

Such questions are amenable to methods of analytic combinatorics and in partic- 
ular to the theory of regular specifications and automata: see Example 11 below for a 
first analysis of hidden patterns (to be continued in Chapter V) and Example 12 for an 
analysis of factor patterns (to be further extended in Chapters III, IV, and IX). 


EXAMPLE I.11. Subsequence (hidden) patterns in a text. A sequence of letters that occurs 
in the right order, but not necessarily contiguously in a text is said to be a “hidden pattern”. 
For instance the pattern “combinatorics” is to be found hidden in Shakespeare’s Hamlet (Act I, 
Scene 1) 


    Dared to the [comb]at; [in] which our v[a]lian[t] Hamlet—
    F[or] so th[i]s side of our known world esteem’d him—
    Did slay this Fortinbras; who by a seal’d compact,
    Well ratified by law and heraldry,
    Did forfeit, with his life, all those his lands
    Whi[c]h he [s]tood seized of, to the conqueror...


Take a fixed finite alphabet $\mathcal{A}$ comprising m letters (m = 26 for English). First, let 
us examine the language $\mathcal{L}$ of all words, also called “texts”, that contain a given word 
$p = p_1 p_2 \cdots p_k$ of length k as a subsequence. These words can be described unambiguously as 
starting with a sequence of letters not containing $p_1$, followed by the letter $p_1$, followed by a 
sequence not containing $p_2$, and so on: 

$$\mathcal{L} = \mathrm{SEQ}(\mathcal{A}\setminus p_1)\,p_1\,\mathrm{SEQ}(\mathcal{A}\setminus p_2)\,p_2 \cdots \mathrm{SEQ}(\mathcal{A}\setminus p_k)\,p_k\,\mathrm{SEQ}(\mathcal{A}).$$


This is in a sense equivalent to parsing words unambiguously according to the leftmost 
occurrence of p as a subsequence. The OGF is accordingly 

$$L(z) = \frac{z^k}{(1-(m-1)z)^k}\,\frac{1}{1-mz}.$$

An easy analysis of the dominant simple pole at z = 1/m shows that 

$$L(z) \underset{z\to 1/m}{\sim} \frac{1}{1-mz}, \qquad\text{so that}\qquad L_n \underset{n\to\infty}{\sim} m^n.$$

Thus, a proportion tending to 1 of all the words of length n do contain p as a subsequence. 


> I.23. A refined analysis. Further consideration of the subdominant pole at z = 1/(m − 1) 
yields, by the methods of Chapter IV, the refined estimate: 

$$1 - \frac{L_n}{m^n} = O\!\left(n^{k-1}\left(1-\frac{1}{m}\right)^{n}\right).$$

Thus, the probability of not containing a given subsequence pattern is exponentially small. < 


A census (Note 24) shows that there are in fact $1.63 \cdot 10^{39}$ occurrences of “combinatorics” 
as a subsequence hidden somewhere in the text of Hamlet, whose length is 120,057 (this is the 
number of letters that constitute the text). Is this the sign of a secret encouragement passed to 
us by the author of Hamlet? 

Here is an analysis of the expected number of hidden patterns based on enumerating en- 
riched words, where an enriched word is a word together with a distinguished occurrence of the 
pattern as a subsequence. Consider the regular specification 


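$$\mathcal{O} = \mathrm{SEQ}(\mathcal{A})\,p_1\,\mathrm{SEQ}(\mathcal{A})\,p_2\,\mathrm{SEQ}(\mathcal{A}) \cdots \mathrm{SEQ}(\mathcal{A})\,p_{k-1}\,\mathrm{SEQ}(\mathcal{A})\,p_k\,\mathrm{SEQ}(\mathcal{A}).$$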


An element of $\mathcal{O}$ is a (2k + 1)-tuple whose first component is an arbitrary word, whose second 
component is the letter $p_1$, and so on, with letters of the pattern and free blocks alternating. In 
other terms, any $\omega \in \mathcal{O}$ represents precisely one possible occurrence of the hidden pattern p in 
a text built over the alphabet $\mathcal{A}$. The associated OGF is simply 


$$O(z) = \frac{z^k}{(1-mz)^{k+1}}.$$

The ratio between the number of occurrences and the number of words of length n then equals 

$$\Omega_n = \frac{[z^n]\,O(z)}{m^n} = m^{-k}\binom{n}{k}, \tag{42}$$

and this quantity represents the expected number of occurrences of the hidden pattern in a 
random word of length n, assuming all such words to be equally likely. For the parameters 
corresponding to the text of Hamlet (n = 120,057) and the pattern “combinatorics” (k = 13), 
the quantity $\Omega_n$ evaluates to $6.96 \cdot 10^{37}$. The number of hidden occurrences observed is thus 23 
times higher than what the uniform model predicts! However, similar methods make it possible 
to take into account nonuniform letter probabilities (Chapter III): based on the frequencies of 
letters in the English text itself, the expected number of occurrences is found to be $1.71 \cdot 10^{39}$; 
this is now only within 5% of what is observed. Thus, Shakespeare did not (probably) conceal 
in his text any message relative to combinatorics. .............. END OF EXAMPLE I.11. 
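
As a quick numerical check of (42), the following Python sketch evaluates $\Omega_n = m^{-k}\binom{n}{k}$ for the Hamlet parameters quoted above. The function name is our own choice for illustration and does not come from the text.

```python
from math import comb


def expected_hidden_occurrences(n: int, k: int, m: int) -> float:
    """Expected number of occurrences of a length-k hidden pattern in a
    uniform random word of length n over an m-letter alphabet (formula (42))."""
    # Exact integer binomial, then one true division to get a float.
    return comb(n, k) / m**k


if __name__ == "__main__":
    omega = expected_hidden_occurrences(n=120_057, k=13, m=26)
    observed = 1.63e39  # census value quoted in the text
    print(f"Omega_n ~ {omega:.3e}, observed / expected ~ {observed / omega:.1f}")
```

Working with exact integers until the final division avoids any overflow in the binomial coefficient, which has close to sixty digits here.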


> I.24. Dynamic programming. The number of occurrences of a subsequence pattern in a text 
can be determined efficiently by scanning the text from left to right and maintaining a running 
count of the number of occurrences of the pattern as well as all its prefixes. < 
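
The scan described in this note can be sketched as follows in Python. This is one possible realization, with names of our own choosing: `counts[j]` holds the number of occurrences, in the text read so far, of the prefix of length j of the pattern.

```python
def count_subsequence_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of `pattern` as a (not necessarily contiguous)
    subsequence of `text`, in time O(len(text) * len(pattern))."""
    k = len(pattern)
    counts = [0] * (k + 1)
    counts[0] = 1  # the empty prefix occurs exactly once
    for letter in text:
        # Update longer prefixes first, so that the current text letter
        # extends only occurrences found strictly before it.
        for j in range(k, 0, -1):
            if pattern[j - 1] == letter:
                counts[j] += counts[j - 1]
    return counts[k]


if __name__ == "__main__":
    print(count_subsequence_occurrences("banana", "ana"))
```

Python integers are unbounded, so counts of the magnitude found in a census of a long text do not overflow.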


I.4.2. Finite automata. We begin with a simple device, the finite automaton, 
that is widely used in models of computation [149] and has wide descriptive power as 
regards structural properties of words^7. 


^7 A far-reaching treatment of automata and paths in graphs, involving both algebraic and asymptotic 
aspects, is given in Part B, Section V.5, p. 320. 
[Figure: state diagram of the automaton, with states $q_0, q_1, q_2, q_3$ and transitions labelled a and b.] 

FIGURE I.9. Words that contain the pattern abb are recognized by a 4-state automaton 
with initial state $q_0$ and final state $q_3$. 


Definition I.11. A finite automaton is a directed multigraph whose edges are labelled 
by letters of the alphabet $\mathcal{A}$. It is customary to refer to vertices as states and to 
denote by Q the set of states. An initial state $q_0 \in Q$ and a set of final states $Q_f \subseteq Q$ 
are designated. 

The automaton is said to be deterministic if for each pair (q, a) with $q \in Q$ and 
$a \in \mathcal{A}$ there exists at most one edge (one also says a transition) starting from q that is 
labelled by the letter a. 


A finite automaton is able to process words, as we now explain. A word 
$w = w_1 \cdots w_n$ is accepted by the automaton if there exists a path in the multigraph 
connecting the initial state $q_0$ to one of the final states of $Q_f$ and whose sequence of edge 
labels is precisely $w_1, \ldots, w_n$. For a deterministic finite automaton, it suffices to start 
from the initial state $q_0$, scan the letters of the word from left to right, and follow at 
each stage the only transition permitted; the word is accepted if the state reached in 
this way after scanning the last letter of w is a final state. 


A finite automaton thus keeps only a finite memory of the past (hence its name) and 
is in a sense a combinatorial counterpart of the notion of Markov chain in probability 
theory. In this book, we shall only consider deterministic automata. 

As an illustration, consider the class $\mathcal{L}$ of all words w that contain the pattern 
abb as a factor (the letters of the pattern should appear contiguously). Such words 
are recognized by a finite automaton with 4 states, $q_0, q_1, q_2, q_3$. The construction is 
classical: state $q_j$ is interpreted as meaning “the first j characters of the pattern have 
just been scanned”, and the corresponding automaton appears in Figure 9. The initial 
state is $q_0$, and there is a unique final state $q_3$. 
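
The left-to-right scan of a deterministic automaton can be illustrated by a minimal Python sketch. The transition table below is our transcription of the 4-state automaton for abb (states 0 to 3 standing for $q_0, \ldots, q_3$); the identifiers `DELTA` and `accepts` are illustrative names, not notation from the text.

```python
# Transition table: DELTA[state][letter] is the next state.
DELTA = {
    0: {"a": 1, "b": 0},  # nothing useful seen yet
    1: {"a": 1, "b": 2},  # "a" just scanned
    2: {"a": 1, "b": 3},  # "ab" just scanned
    3: {"a": 3, "b": 3},  # "abb" has been seen: stay in the final state
}


def accepts(word: str, delta=DELTA, initial: int = 0, final=frozenset({3})) -> bool:
    """Run the deterministic automaton on `word` and report acceptance."""
    state = initial
    for letter in word:
        state = delta[state][letter]  # the only transition permitted
    return state in final


if __name__ == "__main__":
    for w in ["abb", "aabba", "abab", ""]:
        print(repr(w), accepts(w))
```

Only the current state is kept between letters, which is the finite memory referred to above.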
Definition I.12. A language is said to be A-regular (automaton regular) if it coincides 
with the set of words accepted by a deterministic finite automaton. A class $\mathcal{M}$ is 
A-regular if for some regular language $\mathcal{L}$, one has $\mathcal{M} \cong \mathcal{L}$. 


> I.25. Congruence languages. The language of binary representations of numbers that are 
congruent to 2 modulo 7 is A-regular. A similar property holds for any numeration base and 
any boolean combination of basic congruence conditions. < 


> I.26. Binary representation of primes. The language of binary representations of prime 
numbers is neither A-regular nor S-regular. [Hint: this requires the Prime Number Theorem 
and asymptotic methods of Chapter IV.] < 

The following equivalence theorem is briefly discussed in the Appendix (see 
APPENDIX A: Regular languages, p. 678). 


    Equivalence theorem (Kleene-Rabin-Scott). A language is S-regular 
    (specification regular) if and only if it is A-regular (automaton regular). 


These two equivalent notions also coincide with the notion of regularity in formal 
language theory (defined there by means of regular expressions and nondeterministic 
finite automata [3, 149]). As already pointed out, the equivalence is non-trivial: it 
is given by an algorithm that transforms one formalism into the other, but does not 
transparently preserve combinatorial structure (e.g., in some cases, an exponential 
blow up in the size of descriptions is involved). For this reason, we have opted to 
develop both notions of S-regularity and A-regularity in an independent way. 

We next examine the way generating functions can be obtained from a deterministic 
automaton. The process was first discovered in the late 1950s by Chomsky and 
Schützenberger [89]. 

Proposition I.3. Let G be a deterministic finite automaton with state set $Q = \{q_0, \ldots, q_s\}$, 
initial state $q_0$, and set of final states $Q_f = \{q_{i_1}, \ldots, q_{i_f}\}$. The generating function of 
the language $\mathcal{L}$ of all words accepted by the automaton is a rational function that is 
determined under matrix form as 

$$L(z) = \mathbf{u}\,(I - zT)^{-1}\,\mathbf{v}.$$

There the transition matrix T is defined by 

$$T_{i,j} = \operatorname{card}\{a \in \mathcal{A} \text{ such that an edge } (q_i, q_j) \text{ is labelled by } a\};$$

the row vector $\mathbf{u}$ is the vector (1, 0, 0, ..., 0) and the column vector $\mathbf{v} = (v_0, \ldots, v_s)^t$ 
is such that $v_j = [\![q_j \in Q_f]\!]$ ^8. 
In particular, by Cramer’s rule, the OGF of a regular language is the quotient of two 
sparse determinants whose structure directly reflects the automaton transitions. 
PROOF. For $j \in \{0, \ldots, s\}$, introduce the class (language) $\mathcal{L}_j$ of all words w such 
that the automaton, when started in state $q_j$, terminates in one of the final states after 
having read w. The following relation holds for any j: 

$$\mathcal{L}_j = \Delta_j + \sum_{a \in \mathcal{A}} \{a\}\,\mathcal{L}_{(q_j \circ a)}; \tag{43}$$


^8 It proves convenient at this stage to introduce Iverson’s bracket notation: for a predicate P, the 
variable [[P]] has value 1 if P is true and 0 otherwise. 
there $\Delta_j$ is the class $\{\epsilon\}$ formed of the word of length 0 if $q_j$ is final and the empty 
set ($\emptyset$) otherwise; the notation $(q_j \circ a)$ designates the state reached in one step from 
state $q_j$ upon reading letter a. The justification is simple: a language $\mathcal{L}_j$ contains the 
word of length 0 only if the corresponding state $q_j$ is final; a word of length ≥ 1 that 
is accepted starting from state $q_j$ has a first letter a followed by a word that must lead 
to an accepting state when starting from state $q_j \circ a$. 

The translation of (43) is then immediate: 


$$L_j(z) = [\![q_j \in Q_f]\!] + z \sum_{a \in \mathcal{A}} L_{(q_j \circ a)}(z). \tag{44}$$


The collection of all the equations as j varies forms a linear system: with $\mathbf{L}(z)$ the 
column vector $(L_0(z), \ldots, L_s(z))$, one has 

$$\mathbf{L}(z) = \mathbf{v} + zT\,\mathbf{L}(z),$$

where $\mathbf{v}$ and T are as described in the statement. The result follows by matrix inversion 
upon observing that $L(z) \equiv L_0(z)$. 
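
Expanding $(I - zT)^{-1} = \sum_n z^n T^n$ shows that the number of accepted words of length n is $\mathbf{u}\,T^n\,\mathbf{v}$. The Python sketch below computes this by repeated matrix-vector products; the matrix `T_ABB` and vector `V_ABB` are our transcription of the automaton for the pattern abb, and all identifiers are illustrative.

```python
def count_accepted(T, v, n):
    """Return [z^n] u (I - zT)^{-1} v = u T^n v, with u = (1, 0, ..., 0)."""
    vec = list(v)
    size = len(vec)
    for _ in range(n):
        # One step of vec <- T * vec; entry i counts the words of the
        # current length accepted when starting from state q_i.
        vec = [sum(T[i][j] * vec[j] for j in range(size)) for i in range(size)]
    return vec[0]  # u selects the initial state q_0


# T[i][j] = number of letters leading from q_i to q_j (automaton for abb).
T_ABB = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 2],
]
V_ABB = [0, 0, 0, 1]  # only q_3 is final

if __name__ == "__main__":
    print([count_accepted(T_ABB, V_ABB, n) for n in range(8)])
```

The cost is linear in n for a fixed automaton; repeated squaring of T would reduce it to logarithmically many matrix products if very large n were needed.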


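The pattern abb. Consider the automaton recognizing the pattern abb as given in 
Figure 9. The languages $\mathcal{L}_j$ (where $\mathcal{L}_j$ is the set of accepted words when starting from 
state $q_j$) are connected by the system of equations 

$$\begin{aligned}
\mathcal{L}_0 &= a\mathcal{L}_1 + b\mathcal{L}_0\\
\mathcal{L}_1 &= a\mathcal{L}_1 + b\mathcal{L}_2\\
\mathcal{L}_2 &= a\mathcal{L}_1 + b\mathcal{L}_3\\
\mathcal{L}_3 &= a\mathcal{L}_3 + b\mathcal{L}_3 + \epsilon,
\end{aligned}$$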


which directly reflects the graph structure of the automaton. This gives rise to a set of 
equations for the associated OGFs 


$$\begin{aligned}
L_0 &= zL_1 + zL_0\\
L_1 &= zL_1 + zL_2\\
L_2 &= zL_1 + zL_3\\
L_3 &= zL_3 + zL_3 + 1.
\end{aligned}$$


Solving the system, we find the OGF of all words containing the pattern abb: it is 
Lo(z) since the initial state of the automaton is qo, and 


$$L_0(z) = \frac{z^3}{(1-z)(1-2z)(1-z-z^2)}. \tag{45}$$

The partial fraction decomposition 

$$L_0(z) = \frac{1}{1-2z} - \frac{2+z}{1-z-z^2} + \frac{1}{1-z}$$

then yields 

$$L_{0,n} = 2^n - F_{n+3} + 1,$$

with $F_n$ a Fibonacci number. In particular the number of words of length n that do 
not contain abb is $F_{n+3} - 1$, a quantity that grows at an exponential rate of $\varphi^n$, with 
$\varphi = (1+\sqrt{5})/2$ the golden ratio. Thus, all but an exponentially vanishing proportion 
of the strings of length n contain the given pattern abb, a fact that was otherwise to 
be expected on probabilistic grounds. (For instance, from Note 29, a random word 
contains a large number, about n/8, of occurrences of the pattern abb.) 
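
The closed form $2^n - F_{n+3} + 1$ is easy to confirm by exhaustive enumeration for small n. The Python sketch below does so; the function names are ours, and the Fibonacci numbers are taken with the convention $F_0 = 0$, $F_1 = 1$.

```python
from itertools import product


def fibonacci(n: int) -> int:
    """F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def containing_abb_closed_form(n: int) -> int:
    """Number of binary words of length n containing abb, by the closed form."""
    return 2**n - fibonacci(n + 3) + 1


def containing_abb_brute_force(n: int) -> int:
    """Same count, by listing all 2^n words over {a, b}."""
    return sum("abb" in "".join(w) for w in product("ab", repeat=n))


if __name__ == "__main__":
    for n in range(10):
        print(n, containing_abb_closed_form(n), containing_abb_brute_force(n))
```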
> I.27. Regular specification for pattern abb. The pattern abb is simple enough that one can 
come up with an equivalent regular expression describing $\mathcal{L}_0$, whose existence is otherwise 
predicted by the Kleene-Rabin-Scott Theorem. An accepting path in the automaton of Figure 9 
loops around state 0 with a sequence of b’s, then reads an a, loops around state 1 with a sequence 
of a’s and moves to state 2 upon reading a b; then there should be letters making the automaton 
pass through states 1-2-1-2-...-1-2 and finally a b followed by an arbitrary sequence of a’s 
and b’s at state 3. This corresponds to the specification 

$$\mathcal{L}_0 = \mathrm{SEQ}(b)\,a\,\mathrm{SEQ}(a)\,b\,\mathrm{SEQ}(a\,\mathrm{SEQ}(a)\,b)\,b\,\mathrm{SEQ}(a+b) \implies L_0(z) = \frac{z^3}{(1-z)^2\left(1-\frac{z^2}{1-z}\right)(1-2z)},$$

which gives back a form equivalent to (45). < 


EXAMPLE I.12. Words containing or excluding a pattern. Fix an arbitrary pattern 
$p = p_1 p_2 \cdots p_k$ and let $\mathcal{L}$ be the language of words containing at least one occurrence of p as a 
factor. Automata theory implies that the set of words containing a pattern as a factor is 
A-regular, hence admits a rational generating function. Indeed, the construction given for p = abb 
generalizes in an easy manner: there exists a deterministic finite automaton with k + 1 states 
that recognizes $\mathcal{L}$, the states memorizing at each stage the largest prefix of the pattern p just 
seen. As a consequence: The OGF of the language of words containing a given factor pattern 
of length k is a rational function of degree at most k + 1. (The corresponding automaton is in 
fact known as a Knuth-Morris-Pratt automaton [310].) The automaton construction however 
provides the OGF L(z) in determinantal form, so that the relation between this rational form 
and the structure of the pattern is not transparent. 


Autocorrelations. An explicit construction due to Guibas and Odlyzko [253] nicely 
circumvents this problem. It is based on an “equational” specification that yields an alternative 
linear system. The fundamental notion is that of an autocorrelation vector. For a given p, this 
vector of bits $c = (c_0, \ldots, c_{k-1})$ is most conveniently defined in terms of Iverson’s bracket as 

$$c_i = [\![\,p_{i+1} p_{i+2} \cdots p_k = p_1 p_2 \cdots p_{k-i}\,]\!].$$

In other words, the bit $c_i$ is determined by shifting p right by i positions and putting a 1 if 
the remaining letters match the original p. Graphically, $c_i = 1$ if the two framed factors of p 
coincide in 

$$\begin{array}{l}
p \equiv p_1 \cdots p_i\,\boxed{p_{i+1} \cdots p_k} \\
\phantom{p \equiv p_1 \cdots p_i\,}\boxed{p_1 \cdots p_{k-i}}\,p_{k-i+1} \cdots p_k \equiv p.
\end{array}$$

For instance, with p = aabbaa, one has 

    aabbaa
    aabbaa           1
     aabbaa          0
      aabbaa         0
       aabbaa        0
        aabbaa       1
         aabbaa      1

The autocorrelation is then c = (1, 0, 0, 0, 1, 1). The autocorrelation polynomial is defined as 

$$c(z) := \sum_{j=0}^{k-1} c_j z^j.$$

For the example pattern, this gives $c(z) = 1 + z^4 + z^5$. 
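
The definition of the autocorrelation vector translates directly into a string comparison. Here is a minimal Python sketch; the function name is ours, and positions are 0-based, so that `p[i:]` stands for $p_{i+1} \cdots p_k$ and `p[:k - i]` for $p_1 \cdots p_{k-i}$.

```python
def autocorrelation(p: str) -> list[int]:
    """Autocorrelation vector (c_0, ..., c_{k-1}) of the pattern p:
    c_i = 1 exactly when the suffix of p of length k - i equals
    the prefix of p of the same length."""
    k = len(p)
    return [1 if p[i:] == p[: k - i] else 0 for i in range(k)]


if __name__ == "__main__":
    for pattern in ["aabbaa", "abb", "aaa"]:
        print(pattern, autocorrelation(pattern))
```

The bit $c_0$ is always 1, since every pattern matches itself without shift.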
Let $\mathcal{S}$ be the language of words with no occurrence of p and $\mathcal{T}$ the language of words that 
end with p but have no other occurrence of p. First, by appending a letter to a word of $\mathcal{S}$, one 
finds a nonempty word either in $\mathcal{S}$ or $\mathcal{T}$, so that 

$$\mathcal{S} + \mathcal{T} = \{\epsilon\} + \mathcal{S} \times \mathcal{A}. \tag{46}$$


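Next, appending a copy of the word p to a word in $\mathcal{S}$ may only give words that contain p at or 
“near” the end. Precisely, the decomposition based on the leftmost occurrence of p in $\mathcal{S}p$ is 

$$\mathcal{S} \times \{p\} = \mathcal{T} \times \sum_{c_i \neq 0} \{p_{k-i+1}\,p_{k-i+2} \cdots p_k\}, \tag{47}$$

corresponding to the configurations 

[Diagram: a word of $\mathcal{S}$ followed by p, in which the leftmost occurrence of p ends a prefix belonging to $\mathcal{T}$ and may overlap the appended copy of p.] 

The translation of the system (46), (47) into OGFs then gives a system of two equations in the 
two unknowns S, T, 

$$S + T = 1 + mzS, \qquad S \cdot z^k = T\,c(z),$$

which is then readily solved. 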
Proposition I.4. The OGF of words not containing the pattern p as a factor is 

$$S(z) = \frac{c(z)}{z^k + (1-mz)\,c(z)}, \tag{48}$$

where m is the alphabet cardinality, k = |p| the pattern length, and c(z) the autocorrelation 
polynomial of p. 
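
Formula (48) can be checked on small cases by expanding the rational function as a power series and comparing with an exhaustive count. The Python sketch below does this under our own naming; the series division needs only integer arithmetic because the constant term of the denominator is $c_0 = 1$.

```python
from itertools import product


def autocorrelation(p):
    k = len(p)
    return [1 if p[i:] == p[: k - i] else 0 for i in range(k)]


def avoiding_counts(p, m, n_max):
    """Coefficients S_0, ..., S_{n_max} of c(z) / (z^k + (1 - m z) c(z))."""
    k = len(p)
    c = autocorrelation(p)
    # Denominator z^k + (1 - m z) c(z), as coefficients of degree 0..k.
    den = [0] * (k + 1)
    for i, ci in enumerate(c):
        den[i] += ci
        den[i + 1] -= m * ci
    den[k] += 1
    counts = []
    for n in range(n_max + 1):
        numerator_coeff = c[n] if n < k else 0
        # Power-series division; den[0] == 1, so no fractions are needed.
        counts.append(
            numerator_coeff
            - sum(den[j] * counts[n - j] for j in range(1, min(n, k) + 1))
        )
    return counts


def avoiding_counts_brute_force(p, alphabet, n_max):
    """Same counts, by listing all words over `alphabet`."""
    return [
        sum(p not in "".join(w) for w in product(alphabet, repeat=n))
        for n in range(n_max + 1)
    ]


if __name__ == "__main__":
    print(avoiding_counts("aaa", 2, 8))
    print(avoiding_counts_brute_force("aaa", "ab", 8))
```

Two patterns of the same length generally give different counts, which is exactly the dependence on c(z) that the determinantal form hides.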


A bivariate generating function based on the autocorrelation polynomial is derived in 
Chapter II, from which is deduced the existence of a limiting Gaussian law for the number 
of occurrences of any pattern in Chapter IX. ................... END OF EXAMPLE I.12. 


> I.28. At least once. The GFs of words containing at least once the pattern (anywhere) and 
containing it only once at the end are 

$$L(z) = \frac{z^k}{(1-mz)\,(z^k + (1-mz)\,c(z))}, \qquad T(z) = \frac{z^k}{z^k + (1-mz)\,c(z)},$$

respectively. < 
> I.29. Expected number of occurrences of a pattern. For the mean number of occurrences 
of a factor pattern, calculations similar to those employed for the number of occurrences of 
a subsequence (even simpler) can be based on regular specifications. All the occurrences 
of $p = p_1 p_2 \cdots p_k$ as a factor are described by 

$$\mathcal{O} = \mathrm{SEQ}(\mathcal{A})\,(p_1 p_2 \cdots p_k)\,\mathrm{SEQ}(\mathcal{A}) \implies O(z) = \frac{z^k}{(1-mz)^2}.$$

Consequently, the expected number of such contiguous occurrences satisfies 

$$\Omega_n = m^{-k}\,(n-k+1) \sim \frac{n}{m^k}. \tag{49}$$

Thus, the mean number of occurrences is proportional to n. < 
> I.30. Waiting times in strings. Let $\mathcal{L} \subset \mathrm{SEQ}\{a, b\}$ be a language and $\mathcal{S} = \{a, b\}^{\infty}$ be the set 
of infinite strings with the product probability induced by P(a) = P(b) = 1/2. The probability 
that a random string $\omega \in \mathcal{S}$ starts with a word of $\mathcal{L}$ is $\widehat{L}(1/2)$, where $\widehat{L}(z)$ is the OGF of the 
“prefix language” of $\mathcal{L}$, that is, the set of words $w \in \mathcal{L}$ that have no strict prefix belonging to $\mathcal{L}$. 
The GF $\widehat{L}(z)$ serves to express the expected time at which a word in $\mathcal{L}$ is first encountered: this 
is $\frac{1}{2}\widehat{L}'(\frac{1}{2})$. For a regular language, this quantity must be a rational number. < 


> I.31. A probabilistic paradox on strings. In a random infinite sequence, a pattern p of length k 
first occurs on average at time $2^k c(1/2)$, where c(z) is the autocorrelation polynomial. For 
instance, the pattern p = abb tends to occur “sooner” (at average position 8) than p′ = aaa (at 
average position 14). See [253] for a thorough discussion. Here are for instance the epochs at 
which p and p′ are first found in a sample of 20 runs 

    p:  3, 4, 5, 5, 6, 6, 7, 8, 8, 8, 8, 9, 9, 10, 11, 14, 15, 15, 16, 21 
    p′: 3, 4, 8, 8, 9, 10, 11, 11, 11, 12, 17, 22, 23, 27, 27, 27, 44, 47, 52, 52. 

On the other hand, patterns of the same length have the same expected number of occurrences, 
which is puzzling. The catch is that, due to overlaps of p′ with itself, occurrences of p′ tend to 
occur in clusters, but, then, clusters tend to be separated by wider gaps than for p; eventually, 
there is no contradiction. < 
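
The two averages quoted in this note follow from the formula $2^k c(1/2)$ and can also be approached by simulation. The Python sketch below computes the exact value with rational arithmetic and estimates it by sampling; the function names, the number of trials, and the seed are arbitrary choices of ours, and the "time" of first occurrence is taken to be the number of letters read when the pattern is first completed.

```python
import random
from fractions import Fraction


def expected_first_occurrence(p: str) -> Fraction:
    """Exact mean waiting time 2^k c(1/2) for a binary pattern p."""
    k = len(p)
    c_half = sum(Fraction(1, 2**i) for i in range(k) if p[i:] == p[: k - i])
    return 2**k * c_half


def simulate_first_occurrence(p: str, trials: int = 20_000, seed: int = 0) -> float:
    """Empirical mean of the epoch at which p is first completed."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window, time = "", 0
        while window != p:
            # Keep only the last len(p) letters read so far.
            window = (window + rng.choice("ab"))[-len(p):]
            time += 1
        total += time
    return total / trials


if __name__ == "__main__":
    for pattern in ["abb", "aaa"]:
        print(pattern, expected_first_occurrence(pattern),
              round(simulate_first_occurrence(pattern), 2))
```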


> I.32. Borges’s Theorem. Take any fixed set Π of finite patterns. A random text of length n 
contains all the patterns of the set Π (as factors) with probability tending to 1 exponentially 
fast as n → ∞. (Reason: the rational functions S(z/2) with S(z) as in (48) have no pole 
in |z| ≤ 1; see also Chapters IV, V.) 

Note: similar properties hold for many random combinatorial structures. They are some- 
times called “Borges’s Theorem” as a tribute to the famous Argentinian writer Jorge Luis Borges 
(1899-1986) who, in his essay “The Library of Babel”, describes a library so huge as to contain: 
“Everything: the minutely detailed history of the future, the archangels’ autobiographies, the 
faithful catalogues of the Library, thousands and thousands of false catalogues, the demonstra- 
tion of the fallacy of those catalogues, the demonstration of the fallacy of the true catalogue, the 
Gnostic gospel of Basilides, the commentary on that gospel, the commentary on the commen- 
tary on that gospel, the true story of your death, the translation of every book in all languages, 
the interpolations of every book in all books.” < 


In general, automata are useful in establishing a priori the rational character of 
generating functions. They are also surrounded by interesting analytic properties 
(e.g., Perron-Frobenius theory, Chapter IV, that characterizes the dominant poles) 
and by asymptotic probability distributions of associated parameters that are normally 
Gaussian. They are most conveniently used for proving existence theorems, then sup- 
plemented when possible by regular specifications, which are likely to lead to more 
tractable expressions. 


> I.33. Variable length codes. A finite set $\mathcal{F} \subset \mathcal{W}$, where $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$, is called a code if 
any word of $\mathcal{W}$ decomposes in at most one manner into factors that belong to $\mathcal{F}$ (with 
repetitions allowed). For instance $\mathcal{F}$ = {a, ab, bb} is a code and aaabbb = a|a|ab|bb has a unique 
decomposition; $\mathcal{F}'$ = {a, aa, b} is not a code since aaa = a|aa = aa|a = a|a|a. The OGF of 
the set $\mathcal{S}_{\mathcal{F}}$ of all words that admit a decomposition into factors all in $\mathcal{F}$ is a computable rational 
function, irrespective of whether $\mathcal{F}$ is a code. (Hint: use an “Aho-Corasick” automaton [4].) A 
finite set $\mathcal{F}$ is a code iff $S_{\mathcal{F}}(z) = (1 - F(z))^{-1}$. Consequently, the property of being a code 
can be decided in polynomial time using linear algebra. The book by Berstel and Perrin [46] 
develops systematically the theory of such variable-length codes. < 


I.4.3. Related constructions. Words can, at least in principle, encode any 
combinatorial structure. We detail here one example that demonstrates the usefulness of 
such encodings: it is relative to set partitions and Stirling numbers. The point to be 
made is that some amount of “combinatorial preprocessing” is sometimes necessary 
in order to bring combinatorial structures into the framework of symbolic methods. 
[Figure: the 15 set partitions of the domain {α, β, γ, δ}, grouped by number of blocks.] 

FIGURE I.10. The 15 ways of partitioning a four-element domain into blocks correspond 
to $S_4^{(1)} = 1$, $S_4^{(2)} = 7$, $S_4^{(3)} = 6$, $S_4^{(4)} = 1$. 


Set partitions and Stirling partition numbers. A set partition is a partition of a 
finite domain into a certain number of nonempty sets, also called blocks. For instance, 
if the domain is D = {α, β, γ, δ}, there are 15 ways to partition it (Figure 10). Let 
$\mathcal{S}_n^{(r)}$ denote the collection of all partitions of the set [1 .. n] into r non-empty blocks 
and $S_n^{(r)} = \operatorname{card}(\mathcal{S}_n^{(r)})$ the corresponding cardinality. The basic object under 
consideration here is a set partition (not to be confused with integer partitions considered 
earlier). 

It is possible to find an encoding of partitions in $\mathcal{S}_n^{(r)}$ of an n-set into r blocks by 
words over an r-letter alphabet, $\mathcal{B} = \{b_1, b_2, \ldots, b_r\}$, as follows. Consider a set partition 
$\varpi$ that is formed of r blocks. Identify each block by its smallest element, called the 
block leader; then sort the block leaders into increasing order. Define the index of 
a block as the rank of its leader amongst all the r leaders, with ranks conventionally 
starting at 1. Scan the elements 1 to n in order and produce sequentially n letters from 
the alphabet $\mathcal{B}$: for an element belonging to the block of index j, produce the letter $b_j$. 

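For instance, for n = 8, r = 3, the set partition $\varpi = \{\{6, 4\}, \{5, 1, 2\}, \{3, 7, 8\}\}$ 
is reorganized by putting leaders in first position of the blocks and sorting them, 

$$\varpi = \{\overbrace{\{1, 2, 5\}}^{b_1},\ \overbrace{\{3, 7, 8\}}^{b_2},\ \overbrace{\{4, 6\}}^{b_3}\},$$

so that the encoding is 

$$\begin{array}{cccccccc}
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\
b_1 & b_1 & b_2 & b_3 & b_1 & b_3 & b_2 & b_2
\end{array}$$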


In this way, a partition is encoded as a word of length n over $\mathcal{B}$ with the additional 
properties that: (i) all r letters occur; (ii) the first occurrence of $b_1$ precedes the first 
occurrence of $b_2$ which itself precedes the first occurrence of $b_3$, etc. Thus $\mathcal{S}_n^{(r)}$ is 
mapped into words of length n in the language 

$$b_1\,\mathrm{SEQ}(b_1)\cdot b_2\,\mathrm{SEQ}(b_1+b_2)\cdot b_3\,\mathrm{SEQ}(b_1+b_2+b_3)\cdots b_r\,\mathrm{SEQ}(b_1+b_2+\cdots+b_r).$$
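
The encoding of a set partition by a word can be sketched in a few lines of Python. In this illustration (names are ours) the letter $b_j$ is represented by the integer j, and `is_partition_word` tests the two properties (i) and (ii) characterizing the words of the language above.

```python
from itertools import product


def encode_partition(blocks):
    """Encode a partition of {1, ..., n}, given as an iterable of blocks,
    as the list of block indices of the elements 1, 2, ..., n."""
    # Sort blocks by their leader (smallest element).
    ordered = sorted((sorted(block) for block in blocks), key=lambda b: b[0])
    index_of = {x: j for j, block in enumerate(ordered, start=1) for x in block}
    n = len(index_of)
    return [index_of[x] for x in range(1, n + 1)]


def is_partition_word(word, r):
    """True if all letters 1..r occur and their first occurrences
    appear in the order 1, 2, ..., r."""
    largest_seen = 0
    for letter in word:
        if letter > largest_seen + 1:
            return False  # a letter appears before a smaller one has been seen
        largest_seen = max(largest_seen, letter)
    return largest_seen == r


if __name__ == "__main__":
    print(encode_partition([{6, 4}, {5, 1, 2}, {3, 7, 8}]))
    # Number of valid words of length 4 for r = 1, 2, 3, 4.
    print([
        sum(is_partition_word(w, r) for w in product(range(1, r + 1), repeat=4))
        for r in range(1, 5)
    ])
```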
Graphically, this correspondence can be rendered by an “irregular staircase” 
representation, like 

                    4       6
                3               7   8
        1   2           5

where the staircase has length n and height r, each column contains exactly one 
element, each row corresponds to a class in the partition. 
The language specification immediately gives the OGF 

$$S^{(r)}(z) = \frac{z^r}{(1-z)(1-2z)(1-3z)\cdots(1-rz)}.$$


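The partial fraction expansion of $S^{(r)}(z)$ is readily computed, 

$$S^{(r)}(z) = \frac{1}{r!}\sum_{j=0}^{r}\binom{r}{j}\frac{(-1)^{r-j}}{1-jz}, \qquad\text{so that}\qquad S_n^{(r)} = \frac{1}{r!}\sum_{j=1}^{r}(-1)^{r-j}\binom{r}{j}\,j^n.$$

In particular, one has 

$$S_n^{(1)} = 1; \qquad S_n^{(2)} = \frac{1}{2!}\,(2^n - 2); \qquad S_n^{(3)} = \frac{1}{3!}\,(3^n - 3\cdot 2^n + 3).$$

These numbers are known as the Stirling numbers of the second kind, or better, as 
the Stirling partition numbers, and the $S_n^{(r)}$ are nowadays usually denoted by ${n \brace r}$; 
see APPENDIX A: Stirling numbers, p. 680. 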
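
The agreement between the OGF and the explicit sum can be verified numerically. The Python sketch below (identifiers are ours) expands $z^r/((1-z)(1-2z)\cdots(1-rz))$ by successive divisions by $(1 - jz)$ and compares the coefficients with the alternating-sum formula, which is used here for n ≥ 1 only.

```python
from math import comb, factorial


def stirling_from_ogf(r: int, n_max: int) -> list[int]:
    """Coefficients of z^0..z^n_max in z^r / ((1 - z)(1 - 2z)...(1 - rz))."""
    coeffs = [0] * (n_max + 1)
    if r <= n_max:
        coeffs[r] = 1  # numerator z^r
    for j in range(1, r + 1):
        # Dividing a series by (1 - j z): a_n = b_n + j * a_{n-1}.
        for n in range(1, n_max + 1):
            coeffs[n] += j * coeffs[n - 1]
    return coeffs


def stirling_explicit(n: int, r: int) -> int:
    """(1/r!) * sum_{j=1..r} (-1)^(r-j) C(r, j) j^n, for n >= 1."""
    total = sum((-1) ** (r - j) * comb(r, j) * j**n for j in range(1, r + 1))
    return total // factorial(r)  # the sum is an exact multiple of r!


if __name__ == "__main__":
    print([stirling_explicit(4, r) for r in range(1, 5)])
    print(stirling_from_ogf(3, 8))
```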

The counting of set partitions could eventually be done successfully thanks to an 
encoding into words, and the corresponding language forms a constructible class of 
combinatorial structures (actually a regular language). In the next chapter, we shall 
examine another approach to the counting of set partitions that is based on labelled 
structures and exponential generating functions. 


Circular words (necklaces). Let $\mathcal{A}$ be a binary alphabet, viewed as comprised 
of beads of two distinct colours. The class of circular words or necklaces (p. 18 and 
Equation (17)) is defined by a CYC composition: 

$$\mathcal{N} = \mathrm{CYC}(\mathcal{A}) \implies N(z) = \sum_{k \ge 1}\frac{\varphi(k)}{k}\log\frac{1}{1-2z^k}.$$

The series starts as (EIS A000031) 

$$N(z) = 2z + 3z^2 + 4z^3 + 6z^4 + 8z^5 + 14z^6 + 20z^7 + 36z^8 + 60z^9 + \cdots,$$

and the OGF can be expanded: 

$$N_n = \frac{1}{n}\sum_{k \mid n}\varphi(k)\,2^{n/k}. \tag{50}$$

It turns out that $N_n = D_n + 1$ where $D_n$ is the wheel count, p. 45. [The connection is 
easily explained combinatorially: start from a wheel and repaint in white all the nodes 
that are not on the basic circle; then fold them onto the circle.] The same argument 
proves that the number of necklaces over an m-ary alphabet is obtained by replacing 2 
by m in (50). 
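
Formula (50) and its m-ary analogue are straightforward to evaluate. The following Python sketch uses a deliberately naive Euler totient, which is sufficient for small n; the function names are ours.

```python
from math import gcd


def euler_phi(k: int) -> int:
    """Euler's totient: number of 1 <= j <= k coprime to k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def necklace_count(n: int, m: int = 2) -> int:
    """Number of necklaces of n beads over an m-ary alphabet, via (50)."""
    total = sum(euler_phi(k) * m ** (n // k) for k in range(1, n + 1) if n % k == 0)
    return total // n  # the sum is always a multiple of n


if __name__ == "__main__":
    print([necklace_count(n) for n in range(1, 10)])
```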
> I.34. Finite languages. Viewed as a combinatorial object, a finite language is a set of 
distinct words, with size being the total number of letters of all its words. For a binary 
alphabet, the class of all finite languages is thus 

$$\mathcal{FL} = \mathrm{PSET}(\mathrm{SEQ}_{\ge 1}(\mathcal{A})) \implies FL(z) = \exp\left(\sum_{k \ge 1}\frac{(-1)^{k-1}}{k}\,\frac{2z^k}{1-2z^k}\right).$$

The series is (EIS A102866) $1 + 2z + 5z^2 + 16z^3 + 42z^4 + 116z^5 + 310z^6 + \cdots$. < 


I.5. Tree structures 


This section is concerned with basic tree enumerations. Trees are, as we saw 
already, the prototypical recursive structure. The corresponding specifications nor- 
mally lead to nonlinear equations (and systems of such equations) over generating 
functions. The Lagrange inversion theorem is useful in solving the simplest category 
of problems. The functional equations furnished by the symbolic method are then 
conveniently exploited by the asymptotic theory of Chapters VI and VII. A certain 
type of analytic behaviour appears to be universal in trees, namely a square-root singularity; 
accordingly, as we shall see, most tree families occurring in the combinatorial world 
have counting sequences obeying an asymptotic form $C\,A^n\,n^{-3/2}$ that widely extends 
what we know already for Catalan numbers (p. 36). 


I.5.1. Plane trees. Trees are commonly defined as undirected acyclic connected 
graphs. In addition, the trees considered in this book are, unless specified otherwise, 
rooted. In this subsection, we focus attention on plane trees, also sometimes called 
ordered trees, where subtrees dangling from a node are ordered between themselves. 
Alternatively, these trees may be viewed as abstract graph structures accompanied by 
an embedding into the plane (see APPENDIX A: Tree concepts, p. 681 and [306, §2.3]). 
They are precisely described in terms of a sequence construction. 

First, consider the class $\mathcal{G}$ of general plane trees where all node degrees are 
allowed (this repeats p. 33): we have 

$$\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \implies G(z) = \frac{z}{1-G(z)}, \tag{51}$$

and, accordingly, $G(z) = \frac{1-\sqrt{1-4z}}{2}$, so that the number of general trees of size n 
is a Catalan number: 

$$G_n = C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1} = \frac{(2n-2)!}{n!\,(n-1)!}.$$

Many classes of trees defined by all sorts of constraints on properties of nodes 
appear to be of interest in combinatorics and in related areas like logic and computer 
science. Let Ω be a subset of the integers that contains 0. Define the class $\mathcal{T}^{\Omega}$ of 
Ω-restricted trees as formed of trees such that the outdegrees of nodes are constrained 
to lie in Ω. In what follows, an essential rôle is played by a characteristic function that 
encapsulates Ω, 

$$\phi(u) := \sum_{\omega \in \Omega} u^{\omega}.$$

Thus, Ω = {0, 2} determines binary trees, where each node has either 0 or 2 
descendants, and $\phi(u) = 1 + u^2$; the choices Ω = {0, 1, 2} and Ω = {0, 3} determine 
respectively unary-binary trees ($\phi(u) = 1 + u + u^2$) and ternary trees ($\phi(u) = 1 + u^3$); 
the case of general trees corresponds to $\Omega = \mathbb{Z}_{\ge 0}$ and $\phi(u) = (1-u)^{-1}$. 
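Proposition I.5. The ordinary generating function $T^{\Omega}(z)$ of the class $\mathcal{T}^{\Omega}$ of 
Ω-restricted trees is determined implicitly by the equation 

$$T^{\Omega}(z) = z\,\phi(T^{\Omega}(z)),$$

where φ is the characteristic of Ω, namely $\phi(u) := \sum_{\omega \in \Omega} u^{\omega}$. The tree counts are 
given by 

$$T_n^{\Omega} \equiv [z^n]\,T^{\Omega}(z) = \frac{1}{n}\,[u^{n-1}]\,\phi(u)^n. \tag{52}$$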

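PROOF. Clearly, for Ω-restricted sequences, we have 

$$\mathcal{A} = \mathrm{SEQ}_{\Omega}(\mathcal{B}) \implies A(z) = \phi(B(z)),$$

so 

$$\mathcal{T}^{\Omega} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{T}^{\Omega}) \implies T^{\Omega}(z) = z\,\phi(T^{\Omega}(z)).$$

This shows that $T \equiv T^{\Omega}$ is related to z by functional inversion: 

$$z = \frac{T}{\phi(T)}.$$

The Lagrange Inversion Theorem precisely provides expressions for such a case (see 
APPENDIX A: Lagrange Inversion, p. 677): 


    Lagrange Inversion Theorem. The coefficients of an inverse function and 
    of all its powers are determined by coefficients of powers of the direct 
    function: if z = T/φ(T), then 

    $$[z^n]\,T(z) = \frac{1}{n}\,[u^{n-1}]\,\phi(u)^n, \qquad [z^n]\,T(z)^k = \frac{k}{n}\,[u^{n-k}]\,\phi(u)^n.$$


The theorem immediately implies (52). 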


The statement extends trivially to the case where Ω is a multiset of integers, that 
is, a set of integers with repetitions allowed. For instance, Ω = {0, 1, 1, 3} 
corresponds to unary-ternary trees with two types of unary nodes, say, having one of two 
colours; in this case, the characteristic is $\phi(u) = u^0 + 2u^1 + u^3$. The theorem gives 
back the enumeration of general trees, where $\phi(u) = (1-u)^{-1}$, by way of the 
binomial theorem applied to $(1-u)^{-n}$. In general, it implies that, whenever Ω comprises r 
elements, $\Omega = \{\omega_1, \ldots, \omega_r\}$, the tree counts are expressed as an (r − 1)-fold 
summation of binomial coefficients (use the multinomial expansion). An important special 
case detailed below is when Ω has only two elements. 
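
Formula (52) lends itself to direct computation: raise the characteristic polynomial to the nth power, truncated at degree n − 1, and read off one coefficient. The Python sketch below does this with plain integer lists; the function name is ours, and φ is passed as a coefficient list in which `phi[w]` is the multiplicity of the outdegree w (for general trees, a list of ones that is long enough plays the role of $(1-u)^{-1}$).

```python
def tree_counts(phi, n_max):
    """Return [T_1, ..., T_{n_max}] with T_n = (1/n) [u^(n-1)] phi(u)^n."""

    def multiply(a, b, limit):
        """Product of two polynomials, truncated to degrees < limit."""
        out = [0] * limit
        for i, ai in enumerate(a[:limit]):
            if ai:
                for j, bj in enumerate(b[: limit - i]):
                    out[i + j] += ai * bj
        return out

    counts = []
    for n in range(1, n_max + 1):
        power = [1] + [0] * (n - 1)  # the constant polynomial 1
        for _ in range(n):
            power = multiply(power, phi, n)
        counts.append(power[n - 1] // n)  # exact division by Lagrange inversion
    return counts


if __name__ == "__main__":
    print("binary      ", tree_counts([1, 0, 1], 9))
    print("unary-binary", tree_counts([1, 1, 1], 9))
    print("general     ", tree_counts([1] * 10, 9))
```

Recomputing the power for each n keeps the sketch short; a production version would reuse intermediate products.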


> I.35. Forests. Consider ordered k-forests of trees defined by $\mathcal{F} = \mathrm{SEQ}_k(\mathcal{T})$. The Bürmann 
form of Lagrange inversion implies 

$$[z^n]\,F(z) \equiv [z^n]\,T(z)^k = \frac{k}{n}\,[u^{n-k}]\,\phi(u)^n.$$

In particular, one has for forests of general trees ($\phi(u) = (1-u)^{-1}$): 

$$[z^n]\left(\frac{1-\sqrt{1-4z}}{2}\right)^{k} = \frac{k}{n}\binom{2n-k-1}{n-1};$$

the coefficients are also known as “ballot numbers”. < 


EXAMPLE I.13. “Regular” (t-ary) trees. A tree is said to be t-regular or t-ary if Ω consists 
only of the elements {0, t}. In other words, all internal nodes have degree t exactly, hence the 
name (Figure 11). Let $\mathcal{A} := \mathcal{T}^{\{0,t\}}$. In an element of $\mathcal{A}$, a node is either terminal or it has 
exactly t children. In this case, the characteristic is $\phi(u) = 1 + u^t$ and the binomial theorem 
combined with the Lagrange inversion formula gives 

$$A_n = \frac{1}{n}\,[u^{n-1}]\,(1+u^t)^n = \frac{1}{n}\binom{n}{\frac{n-1}{t}} \qquad\text{provided } n \equiv 1 \bmod t.$$

As the formula shows, only trees of total size of the form n = tν + 1 exist (a well-known fact 
otherwise easily checked by induction), and 

$$A_{t\nu+1} = \frac{1}{t\nu+1}\binom{t\nu+1}{\nu} = \frac{1}{(t-1)\nu+1}\binom{t\nu}{\nu}. \tag{53}$$

A particular rôle is played by 2-regular trees known as binary trees. Then a form equivalent 
to (53) reads: 

    The number of plane binary trees having a total of 2ν + 1 nodes (i.e., ν binary nodes 
    and ν + 1 external nodes) is the Catalan number $C_\nu = \frac{1}{\nu+1}\binom{2\nu}{\nu}$. 


In this book, we shall use B to denote the class of binary trees. Size will be freely measured, 
depending on context and convenience, by recording internal, external, or all nodes. 

There is a variant of the determination of (53) that avoids congruence restrictions. Let $\mathcal{A}$
be the class of $t$-ary trees and define the class $\widehat{\mathcal{A}}$ of “pruned” trees as trees of $\mathcal{A}$ deprived of
all their external nodes. The trees in $\widehat{\mathcal{A}}$ now have nodes that are of degree at most $t$. In order
to make $\widehat{\mathcal{A}}$ bijectively equivalent to $\mathcal{A}$, it suffices to regard trees of $\widehat{\mathcal{A}}$ as having $\binom{t}{j}$ possible
types of nodes of degree $j$ for any $j \in [0, t]$: each node type in $\widehat{\mathcal{A}}$ plainly encodes which of the
original $t - j$ subtrees have been pruned. The equations above immediately generalize to the
case of an $\Omega$ with multiplicities. One finds $\widehat{\phi}(u) = (1 + u)^t$ and $\widehat{A}(z) = z \widehat{\phi}(\widehat{A}(z))$, so that, by
Lagrange inversion,

$$\widehat{A}_\nu = \frac{1}{\nu} \binom{t\nu}{\nu - 1},$$

yet another equivalent form of (53), since, by basic combinatorics, $\widehat{A}_\nu = A_{t\nu+1}$. END OF EXAMPLE I.13.
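
The closed form (53) can be checked numerically. The sketch below counts $t$-ary trees by their number $\nu$ of internal nodes in two ways: from the binomial formula, and by iterating the equation $B(z) = 1 + zB(z)^t$ (a tree is either an external node or an internal node with $t$ subtrees, with only internal nodes contributing to size). The function names and the truncated-series representation are illustrative assumptions, not notation from the text.

```python
from math import comb

def tary_formula(t, nu):
    """Formula (53): t-ary trees with nu internal nodes (t*nu + 1 nodes in all)."""
    return comb(t * nu, nu) // ((t - 1) * nu + 1)

def series_mul(a, b, n):
    """Product of two power series, truncated to the first n coefficients."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            c[i + j] += ai * bj
    return c

def tary_by_iteration(t, max_nu):
    """Coefficients b[0..max_nu] of B(z) = 1 + z*B(z)^t by fixed-point iteration."""
    n = max_nu + 1
    b = [1] + [0] * max_nu
    for _ in range(max_nu):              # each pass settles one more coefficient
        power = [1] + [0] * max_nu
        for _ in range(t):
            power = series_mul(power, b, n)
        b = [1] + power[:max_nu]         # B <- 1 + z * B^t
    return b

print(tary_by_iteration(3, 5))
print([tary_formula(3, nu) for nu in range(6)])
```

Both lines print the same list; for $t = 2$ the values are the Catalan numbers.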


▷ I.36. Motzkin numbers. Let $M(z)$ be the generating function for unary-binary trees ($\Omega = \{0, 1, 2\}$):

$$M(z) = z \left( 1 + M(z) + M(z)^2 \right) \quad \Longrightarrow \quad M(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z}.$$

One has $M(z) = z + z^2 + 2z^3 + 4z^4 + 9z^5 + 21z^6 + 51z^7 + \cdots$. The coefficients
$M_n = [z^n] M(z)$ are given in Lagrange form as

$$M_n = \frac{1}{n} [u^{n-1}] (1 + u + u^2)^n = \frac{1}{n} \sum_k \binom{n}{k} \binom{n-k}{k-1},$$

and called Motzkin numbers (EIS A001006). ◁

FIGURE I.11. A general tree of $\mathcal{G}_{51}$ (left) and a binary tree of $\mathcal{T}^{\{0,2\}}_{51}$ (right) drawn
uniformly at random amongst the $C_{50}$ and $C_{25}$ possible trees respectively, with $C_n = \frac{1}{n+1} \binom{2n}{n}$ the $n$th Catalan number.


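▷ I.37. Yet another variant of $t$-ary trees. Let $\widetilde{\mathcal{A}}$ be the class of $t$-ary trees, but with size now
defined as the number of external nodes (leaves). Then, one has

$$\widetilde{\mathcal{A}} = \mathcal{Z} + \mathrm{SEQ}_t(\widetilde{\mathcal{A}}) \qquad \text{and} \qquad [z^{(t-1)\nu+1}] \widetilde{A}(z) = \frac{1}{(t-1)\nu + 1} \binom{t\nu}{\nu}.$$

The binomial form of $\widetilde{A}_n$ follows from Lagrange inversion, since $\widetilde{A} = z / (1 - \widetilde{A}^{t-1})$. ◁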


EXAMPLE I.14. Hipparchus of Rhodes and Schröder. In 1870, the German mathematician
Ernst Schröder (1841-1902) published a paper entitled Vier combinatorische Probleme. The
paper had to do with the number of terms that can be built out of $n$ variables using nonassociative
operations. In particular, the second of his four problems asks for the number of ways
a string of $n$ identical letters, say $x$, can be “bracketted”. The rule is best stated recursively:
$x$ itself is a bracketting and if $\sigma_1, \sigma_2, \ldots, \sigma_k$ with $k \ge 2$ are bracketted expressions, then the
$k$-ary product $(\sigma_1)(\sigma_2) \cdots (\sigma_k)$ is a bracketting.

Let $\mathcal{S}$ denote the class of all brackettings, where size is the number of variables. Then, the
recursive definition is readily translated into the formal specification

(54) $$\mathcal{S} = \mathcal{Z} + \mathrm{SEQ}_{\ge 2}(\mathcal{S}).$$

To each bracketting of size $n$ is associated a tree whose external nodes contain the variable $x$
(and determine size), with internal nodes corresponding to brackettings and having degree at
least 2 (while not contributing to size). The functional equation satisfied by the OGF is then

(55) $$S(z) = z + \frac{S(z)^2}{1 - S(z)}.$$

This is not a priori of the type corresponding to Proposition I.5 because not all nodes contribute
to size in this particular application. However, the quadratic equation induced by (55) can be
solved, giving

$$\begin{aligned}
S(z) &= \frac{1}{4} \left( 1 + z - \sqrt{1 - 6z + z^2} \right) \\
&= z + z^2 + 3z^3 + 11z^4 + 45z^5 + 197z^6 + 903z^7 + 4279z^8 + 20793z^9 \\
&\quad + 103049z^{10} + 518859z^{11} + \cdots,
\end{aligned}$$

where the coefficients are EIS A001003. (These numbers also count series-parallel networks of
a specified type (e.g., serial in Figure 12, bottom), where placement in the plane matters.)

In an instructive paper, Stanley [448] discusses a page of Plutarch’s Moralia where there
appears the following statement:

“Chrysippus says that the number of compound propositions that can be made from
only ten simple propositions exceeds a million. (Hipparchus, to be sure, refuted this
by showing that on the affirmative side there are 103,049 compound statements, and
on the negative side 310,952.)”

FIGURE I.12. An and-or positive proposition of the conjunctive type (top), its associated
tree (middle), and an equivalent planar series-parallel network of the serial type (bottom).
The proposition shown at the top is
$(x_1) \wedge (x_2 \vee (x_3 \wedge x_4 \wedge x_5) \vee x_6) \wedge ((x_7 \wedge x_8) \vee (x_9 \wedge x_{10}))$.

It is notable that the tenth number of Hipparchus of Rhodes⁹ (c. 190-120 B.C.) is precisely
$S_{10} = 103{,}049$. This is, for instance, the number of logical formulae that can be formed from
ten boolean variables $x_1, \ldots, x_{10}$ (used once each and in this order) using and-or connectives
in alternation (no “negation”), upon starting from the top in some conventional fashion (e.g.,
with an and-clause); see Figure 12¹⁰. Hipparchus was naturally not cognizant of generating
functions, but with the technology of the time (and a rather remarkable mind!), he would still
be able to discover a recurrence equivalent to (55),

(56) $$S_n = [\![ n \ge 2 ]\!] \left( \sum_{\substack{n_1 + n_2 + \cdots + n_k = n \\ k \ge 2}} S_{n_1} S_{n_2} \cdots S_{n_k} \right) + [\![ n = 1 ]\!],$$

where the sum has only 42 essentially different terms for $n = 10$ (see [448] for a discussion),
and finally determine $S_{10}$. END OF EXAMPLE I.14.
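
Recurrence (56) can be carried out much as it would be by hand. The sketch below groups the compositions $n_1 + \cdots + n_k = n$ ($k \ge 2$) by their underlying partition and weights each partition by its number of distinct orderings. The helper names are illustrative and the grouping by partitions is one possible reading of “essentially different terms”, not a reconstruction of Hipparchus's actual procedure.

```python
from collections import Counter
from math import factorial, prod

def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def schroeder_hipparchus(n_max):
    """Return {n: S_n} for 1 <= n <= n_max using recurrence (56)."""
    s = {1: 1}
    for n in range(2, n_max + 1):
        total = 0
        for p in partitions(n):
            if len(p) < 2:                    # the k-ary product needs k >= 2
                continue
            orderings = factorial(len(p))     # compositions sharing this partition
            for mult in Counter(p).values():
                orderings //= factorial(mult)
            total += orderings * prod(s[part] for part in p)
        s[n] = total
    return s

print(schroeder_hipparchus(10)[10])
```

The printed value is the coefficient of $z^{10}$ in the expansion of $S(z)$.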


▷ I.38. The Lagrangean form of Schröder’s GF. The generating function $S(z)$ admits the form

$$S(z) = z \phi(S(z)) \qquad \text{where} \qquad \phi(y) = \frac{1 - y}{1 - 2y}$$

is the OGF of compositions. Consequently, one has, for $n \ge 2$,

$$S_n = \frac{1}{n} [u^{n-1}] \left( \frac{1 - u}{1 - 2u} \right)^n = \frac{1}{n} \sum_{k=0}^{n-2} \binom{2n - k - 2}{n - 1} \binom{n - 2}{k}.$$

Is there a direct combinatorial relation to compositions? ◁

⁹ This was first observed by David Hough in 1994; see [448]. In [256], Habsieger et al. further note
that $\frac{1}{2}(S_{10} + S_{11}) = 310{,}954$, and suggest a related interpretation (based on negated variables) for the
other count given by Hipparchus.

¹⁰ Any functional term admits a unique tree representation. Here, as soon as the root type has been
fixed (e.g., an $\wedge$ connective), the others are determined by level parity. The constraint of node degrees $\ge 2$
in the tree means that no superfluous connectives are used. Finally, any monotone boolean expression can
be represented by a series-parallel network: the $x_j$ are viewed as switches with the true and false values
being associated with closed and open circuits, respectively.

Tree variety                                                            1  2  3  4   5   6    7    8    $n \to \infty$
Plane gen.    $\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G})$          1  1  2  5  14  42  132  429    $\sim 4^{n-1} / \sqrt{\pi n^3}$
Unord. gen.   $\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H})$         1  1  2  4   9  20   48  115    $\sim \lambda \cdot \beta^n / n^{3/2}$
Unord. bin.   $\mathcal{U} = \mathcal{Z} + \mathrm{MSET}_2(\mathcal{U})$            1  1  1  2   3   6   11   23    $\sim \lambda_2 \cdot \beta_2^n / n^{3/2}$

FIGURE I.13. The number of rooted trees of type plane/unordered and general/binary
for $n = 1..8$ and the corresponding asymptotic forms. There, $\lambda \doteq 0.43992$, $\beta \doteq 2.95576$
for unordered general (EIS A000081); $\lambda_2 \doteq 0.31877$, $\beta_2 \doteq 2.48325$ for unordered
binary. For binary trees (EIS A001190), size is, by convention here, the number of
external nodes.


▷ I.39. Faster determination of Schröder numbers. By forming a differential equation satisfied
by $S(z)$ and extracting coefficients, one obtains a recurrence

$$(n + 2) S_{n+2} - 3(2n + 1) S_{n+1} + (n - 1) S_n = 0, \qquad n \ge 1,$$

that entails a fast determination (in linear time) of the $S_n$. In contrast, Hipparchus’s recurrence
implies an algorithm of complexity $e^{O(\sqrt{n})}$ in the number of arithmetic operations involved. ◁
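
As a minimal sketch of the linear-time computation, the code below runs the three-term recurrence from the initial values $S_1 = S_2 = 1$ and cross-checks it against the binomial sum $S_n = \frac{1}{n} \sum_{k=0}^{n-2} \binom{2n-k-2}{n-1} \binom{n-2}{k}$ obtained by Lagrange inversion from $S = z(1 - S)/(1 - 2S)$. Function names are illustrative.

```python
from math import comb

def schroeder_linear(n_max):
    """S_1..S_{n_max} from (n+2) S_{n+2} = 3(2n+1) S_{n+1} - (n-1) S_n, n >= 1."""
    s = [0, 1, 1]                        # s[n] = S_n, with S_1 = S_2 = 1
    for n in range(1, n_max - 1):
        s.append((3 * (2 * n + 1) * s[n + 1] - (n - 1) * s[n]) // (n + 2))
    return s[1:n_max + 1]

def schroeder_lagrange(n):
    """Binomial sum from Lagrange inversion, valid for n >= 2."""
    return sum(comb(2 * n - k - 2, n - 1) * comb(n - 2, k) for k in range(n - 1)) // n

print(schroeder_linear(11))
print([schroeder_lagrange(n) for n in range(2, 12)])
```

Each new term costs a constant number of arithmetic operations, which is the linear-time behaviour referred to above.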


I.5.2. Nonplane trees. An unordered tree, also called nonplane tree, is a tree
in the general graph-theoretic sense, so that there is no order distinction between
subtrees emanating from a common node. The unordered trees considered here are
furthermore rooted, meaning that one of the nodes is distinguished as the root. Accordingly,
in the language of constructible structures, a rooted unordered tree is a root
node linked to a multiset of trees. Thus, the class $\mathcal{H}$ of all unordered trees admits the
recursive specification:

$$\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}) \quad \Longrightarrow \quad
\begin{aligned}
H(z) &= z \prod_{m=1}^{\infty} (1 - z^m)^{-H_m} \\
&= z \exp\left( H(z) + \frac{1}{2} H(z^2) + \frac{1}{3} H(z^3) + \cdots \right).
\end{aligned}$$

The first form of the OGF was given by Cayley in 1857 [54, p. 43]; it does not admit
a closed form solution, though the equation permits one to determine all the $H_n$
recursively (EIS A000081):

$$H(z) = z + z^2 + 2z^3 + 4z^4 + 9z^5 + 20z^6 + 48z^7 + 115z^8 + 286z^9 + \cdots.$$

In addition, the local analysis of the singularities of $H(z)$ yields a bona fide asymptotic
expansion for $H_n$, a fact first discovered by Pólya [397] who proved that

(57) $$H_n \sim \lambda \cdot \frac{\beta^n}{n^{3/2}},$$

for some positive constants $\lambda \doteq 0.43992$ and $\beta \doteq 2.95576$. The “universality” in tree
enumerations of such estimates, of the form $A^n n^{-3/2}$, is a major theme of Chapter VII.


▷ I.40. Fast determination of the Cayley-Pólya numbers. Logarithmic differentiation of the
equation satisfied by $H(z)$ provides for the $H_n$ a recurrence that permits one to compute $H_n$
in time polynomial in $n$. (Note: a similar technique applies to the partition numbers $P_n$; see
p. 40.) ◁
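
One way to make this explicit: applying $z \frac{d}{dz} \log$ to $H(z) = z \exp\left( \sum_{k \ge 1} H(z^k)/k \right)$ gives $zH'(z) - H(z) = H(z) \sum_{k \ge 1} c_k z^k$ with $c_k = \sum_{d \mid k} d H_d$, hence $n H_{n+1} = \sum_{k=1}^{n} c_k H_{n+1-k}$. The sketch below implements this recurrence; the auxiliary sequence $c_k$ and the function name are notation introduced here for illustration.

```python
def cayley_polya(n_max):
    """H_1..H_{n_max}: numbers of rooted unordered trees, by the recurrence
    n * H_{n+1} = sum_{k=1..n} c_k * H_{n+1-k},  c_k = sum_{d | k} d * H_d."""
    h = [0, 1]                           # h[n] = H_n, with H_1 = 1
    c = [0]                              # c[k], filled in as H_k becomes known
    for n in range(1, n_max):
        c.append(sum(d * h[d] for d in range(1, n + 1) if n % d == 0))
        h.append(sum(c[k] * h[n + 1 - k] for k in range(1, n + 1)) // n)
    return h[1:]

print(cayley_polya(9))
```

With the divisor sums computed naively as here, the cost is quadratic in $n$ in arithmetic operations.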


The enumeration of the class of trees defined by an arbitrary set $\Omega$ of node degrees
immediately results from the translation of sets of fixed cardinality.


Proposition I.6. Let $\Omega \subset \mathbb{N}$ be a finite set of integers containing 0. The OGF $U(z)$ of
nonplane trees with degrees constrained to lie in $\Omega$ satisfies a functional equation of
the form

(58) $$U(z) = z\, \Phi\left( U(z), U(z^2), U(z^3), \ldots \right),$$

for some computable polynomial $\Phi$.


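PROOF. The class of trees satisfies the combinatorial equation,

$$\mathcal{U} = \mathcal{Z} \times \mathrm{MSET}_\Omega(\mathcal{U}) \qquad \left( \mathrm{MSET}_\Omega(\mathcal{U}) := \sum_{\omega \in \Omega} \mathrm{MSET}_\omega(\mathcal{U}) \right),$$

where the multiset construction reflects non-planarity, since subtrees stemming from
a node can be freely rearranged between themselves and may appear repeated. Theorem
I.3 (p. 78) provides the translation of $\mathrm{MSET}_\omega(\mathcal{U})$:

$$\Phi\left( U(z), U(z^2), U(z^3), \ldots \right) = \sum_{\omega \in \Omega} [u^\omega] \exp\left( \frac{u}{1} U(z) + \frac{u^2}{2} U(z^2) + \cdots \right).$$

The result follows.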

Once more, there are no explicit formulae but only functional equations implicitly
determining the generating functions. However, as we shall see in Chapter VII, the
equations may be used to analyse the dominant singularity of $U(z)$. It is found that a
“universal” law governs the singularities of simple tree generating functions that are
of the type $\sqrt{1 - z/\rho}$, corresponding to a general asymptotic scheme (see Figure 13),

(59) $$U^{\Omega}_n \sim \lambda_\Omega \frac{(\beta_\Omega)^n}{\sqrt{n^3}}.$$

Many of these questions have their origin in combinatorial chemistry, starting with
Cayley in the 19th century [54, Ch. 4]. Pólya reexamined these questions, and in
his important paper published in 1937 [395] he developed at the same time a general
theory of combinatorial enumerations under group actions and of asymptotic methods
giving rise to estimates like (59). See the book by Harary and Palmer [259] for more
on this topic or Read’s edition of Pólya’s paper [397].

▷ I.41. Binary nonplane trees. Unordered binary trees with size measured by the number of
external nodes are described by the equation $\mathcal{U} = \mathcal{Z} + \mathrm{MSET}_2(\mathcal{U})$. The functional equation
determining $U(z)$ is

(60) $$U(z) = z + \frac{1}{2} U(z)^2 + \frac{1}{2} U(z^2); \qquad U(z) = z + z^2 + z^3 + 2z^4 + 3z^5 + \cdots.$$

The asymptotic analysis of the coefficients (EIS A001190) was carried out by Otter [382] who
established an estimate of type (59). (The values of the constants are summarized in Figure 13.)
The quantity $U_n$ is also the number of structurally distinct products of $n$ elements under a
commutative nonassociative binary operation. ◁
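
Extracting the coefficient of $z^n$ in (60) gives, for $n \ge 2$, $U_n = \frac{1}{2} \sum_{j=1}^{n-1} U_j U_{n-j} + \frac{1}{2} [\![ n \text{ even} ]\!]\, U_{n/2}$. A minimal sketch of the resulting computation (the function name is illustrative):

```python
def unordered_binary(n_max):
    """U_1..U_{n_max} from U(z) = z + U(z)^2 / 2 + U(z^2) / 2."""
    u = [0, 1]                           # u[n] = U_n, with U_1 = 1
    for n in range(2, n_max + 1):
        twice = sum(u[j] * u[n - j] for j in range(1, n))   # [z^n] U(z)^2
        if n % 2 == 0:
            twice += u[n // 2]                              # [z^n] U(z^2)
        u.append(twice // 2)
    return u[1:]

print(unordered_binary(8))
```

The term $U(z^2)$ accounts for the products whose two factors are identical, which would otherwise be counted with weight one half.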


▷ I.42. Hierarchies. Define the class $\mathcal{K}$ of hierarchies to be trees without nodes of outdegree 1
and size determined by the number of external nodes. The corresponding OGF satisfies (Cayley
1857, see [54, p. 43])

$$K(z) = \frac{1}{2} z + \frac{1}{2} \left[ \exp\left( K(z) + \frac{1}{2} K(z^2) + \cdots \right) - 1 \right],$$

from which the first values are found (EIS A000669):

$$K(z) = z + z^2 + 2z^3 + 5z^4 + 12z^5 + 33z^6 + 90z^7 + 261z^8 + 766z^9 + 2312z^{10} + \cdots.$$

These numbers also enumerate hierarchies in statistical classification theory [475]. They are the
non-planar analogues of the Hipparchus-Schröder numbers on p. 64. ◁
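
The first values can be recomputed under the reading that a hierarchy of size $n \ge 2$ is a root carrying a multiset of at least two smaller hierarchies, so that $K_n = [z^n] \prod_{m=1}^{n-1} (1 - z^m)^{-K_m}$. The sketch below maintains this product incrementally; this product-based scheme is an assumption made here for illustration and is not the method of the text.

```python
def hierarchies(n_max):
    """K_1..K_{n_max}, using K_n = [z^n] prod_{m < n} (1 - z^m)^(-K_m) for n >= 2."""
    k = [0, 1]                           # k[n] = K_n, with K_1 = 1
    p = [1] + [0] * n_max                # running product, truncated at z^n_max
    for n in range(1, n_max + 1):
        if n >= 2:
            k.append(p[n])               # multisets of >= 2 smaller hierarchies
        for _ in range(k[n]):            # multiply p by 1 / (1 - z^n), K_n times
            for i in range(n, n_max + 1):
                p[i] += p[i - n]
    return k[1:]

print(hierarchies(10))
```

The repeated multiplication by $1/(1 - z^n)$ is simple but slow when $K_n$ is large; it is adequate only for the small range shown.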


▷ I.43. Nonplane series-parallel networks. Consider the class $\mathcal{SP}$ of series-parallel networks
as previously considered in relation to Hipparchus of Rhodes’ example, p. 65, but ignoring
planar embeddings. Thus, all parallel arrangements of the (serial) networks $s_1, \ldots, s_k$ are considered
equivalent, while the linear arrangement in each serial network matters. For instance,
for $n = 2, 3$:

[Diagram: the nonplane series-parallel networks built on 2 and on 3 elements.]

Thus, $SP_2 = 2$ and $SP_3 = 5$. This is modelled by the grammar:

$$\mathcal{S} = \mathcal{Z} + \mathrm{SEQ}_{\ge 2}(\mathcal{P}), \qquad \mathcal{P} = \mathcal{Z} + \mathrm{MSET}_{\ge 2}(\mathcal{S}),$$

and, to avoid counting networks of one element twice,

$$SP(z) = S(z) + P(z) - z = z + 2z^2 + 5z^3 + 15z^4 + 48z^5 + 167z^6 + 602z^7 + 2256z^8 + \cdots.$$

This is EIS A003430. The objects are usually described as networks of electric resistors. ◁


I.5.3. Related constructions. Trees underlie recursive structures of all sorts. A
first illustration is provided by the fact that the Catalan numbers, $C_n = \frac{1}{n+1} \binom{2n}{n}$,
count general trees ($\mathcal{G}$) of size $n + 1$, binary trees ($\mathcal{B}$) of size $n$ (if size is defined as
the number of internal nodes), as well as triangulations ($\mathcal{T}$) comprised of $n$ triangles.
The combinatorialist John Riordan even coined the name Catalan domain for the area
within combinatorics that deals with objects enumerated by Catalan numbers, and
Stanley’s book contains an exercise [449, Ex. 6.19] whose statement alone spans ten
full pages, with a list of 66 types of objects(!) belonging to the Catalan domain. We
shall illustrate the importance of Catalan numbers by describing a few fundamental
correspondences that explain the occurrence of Catalan numbers in several areas of
combinatorics.

Rotation of trees. The combinatorial isomorphism relating $\mathcal{G}$ and $\mathcal{B}$ (albeit with
a shift in size) coincides with a classical technique of computer science [306, §2.3.2].
To wit, a general tree can be represented in such a way that every node has two types
of links, one pointing to the leftmost child, the other to the next sibling in left-to-right
order. Under this representation, if the root of the general tree is left aside, then every
node is linked to two other (possibly empty) subtrees. In other words, general trees
with $n$ nodes are equinumerous with pruned binary trees with $n - 1$ nodes:

$$\mathcal{G}_n \cong \mathcal{B}_{n-1}.$$

Graphically, this is illustrated as follows:

[Diagram: a general tree and, rightmost, the binary tree obtained from its leftmost-child and next-sibling links.]

The rightmost tree is a binary tree drawn in a conventional manner, following a 45°
tilt. This justifies the name of “rotation correspondence” often given to this transformation.
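
The rotation correspondence can be sketched in a few lines. Here a general tree is represented, by assumption, as the Python list of its child subtrees (so `[]` is a single node), and a pruned binary tree as either `None` or a pair `(left, right)`; these encodings and the function names are illustrative choices.

```python
def forest_to_binary(forest):
    """Encode an ordered forest as a pruned binary tree (left, right), or None.

    Left link: the children of the first tree; right link: its next siblings.
    """
    if not forest:
        return None
    first, *siblings = forest
    return (forest_to_binary(first), forest_to_binary(siblings))

def rotate(tree):
    """General tree with n nodes -> pruned binary tree with n - 1 nodes."""
    return forest_to_binary(tree)        # the root is left aside

def binary_to_forest(b):
    """Inverse transformation."""
    if b is None:
        return []
    left, right = b
    return [binary_to_forest(left)] + binary_to_forest(right)

def size(b):
    """Number of nodes of a pruned binary tree."""
    return 0 if b is None else 1 + size(b[0]) + size(b[1])

def forests(m):
    """All ordered forests with m nodes in total."""
    if m == 0:
        return [[]]
    out = []
    for k in range(1, m + 1):                    # size of the first tree
        for children in forests(k - 1):
            for rest in forests(m - k):
                out.append([children] + rest)
    return out

tree = [[], [[], []]]                            # 5 nodes
print(rotate(tree), size(rotate(tree)))
```

Applying `rotate` to every general tree with 5 nodes (the lists in `forests(4)`) produces pairwise distinct binary trees with 4 nodes each, in agreement with $G_n = B_{n-1}$.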


Tree decomposition of triangulations. The relation between binary trees $\mathcal{B}$ and
triangulations $\mathcal{T}$ is equally simple: draw a triangulation; define the root triangle as
the one that contains the edge connecting two designated vertices (for instance, the
vertices numbered 0 and 1); associate to the root triangle the root of a binary tree;
next, associate recursively to the subtriangulation on the left of the root triangle a left
subtree; do similarly for the right subtriangulation giving rise to a right subtree.

Under this correspondence, tree nodes correspond to triangle faces, while edges connect
adjacent triangles. What this correspondence proves is the combinatorial isomorphism

$$\mathcal{T}_n \cong \mathcal{B}_n.$$

We turn next to another type of objects that are in correspondence with trees.
These can be interpreted as words encoding tree traversals and, geometrically, as paths
in the discrete plane $\mathbb{Z} \times \mathbb{Z}$.


Tree codes and Łukasiewicz words. Any tree can be traversed starting from
the root, proceeding depth-first (and left-to-right), and backtracking upwards once a
subtree has been completely traversed. For instance, in the tree (written here as a term,
each node being followed by the list of its children)

(61) $$\tau = a\bigl( b(d(h), e, f),\; c(g(i, j)) \bigr),$$

the first visits to nodes take place in the following order

$$a, b, d, h, e, f, c, g, i, j.$$

(Note: the tags $a, b, \ldots$ added for convenience in order to distinguish nodes have no
special meaning; only the abstract tree shape matters here.) This order is known as
preorder or prefix order since a node is preferentially visited before its children.

Given a tree, the listing of the outdegrees of nodes in prefix order will be called
the preorder degree sequence. For the tree of (61), this is

$$\sigma = (2, 3, 1, 0, 0, 0, 1, 2, 0, 0).$$

It is a fact that the degree sequence determines the tree unambiguously. Indeed, given
the degree sequence, the tree is reconstructed step by step, adding nodes one after the
other at the leftmost available place. For $\sigma$, the first steps are then

[Diagram: the partial trees obtained after placing the first few nodes of $\sigma$.]

Next, if one represents degree $j$ by a “symbol” $f_j$, then the degree sequence becomes
a word over the infinite alphabet $\mathcal{F} = \{f_0, f_1, \ldots\}$, for instance,

$$\sigma \leadsto f_2 f_3 f_1 f_0 f_0 f_0 f_1 f_2 f_0 f_0.$$


This can be interpreted in logical language as a denotation for a functional term built
out of symbols from $\mathcal{F}$, where $f_j$ represents a function of degree (or “arity”) $j$. The
correspondence even becomes obvious if superfluous parentheses are added at appropriate
places to delimit scope:

$$\sigma \leadsto f_2(f_3(f_1(f_0), f_0, f_0), f_1(f_2(f_0, f_0))).$$

Such codes are known as Łukasiewicz codes¹¹, in recognition of the work of the
Polish logician with that name. Jan Łukasiewicz (1878-1956) introduced them in
order to completely specify the syntax of terms in various logical calculi; they prove
nowadays basic in the development of parsers and compilers in computer science.

Finally, a tree code can be rendered as a walk over the discrete lattice $\mathbb{Z} \times \mathbb{Z}$.
Associate to any $f_j$ (i.e., any node of outdegree $j$) the displacement $(1, j - 1) \in \mathbb{Z} \times \mathbb{Z}$,
and plot the sequence of moves starting from the origin. On the example one finds:

$$\begin{array}{cccccccccc}
f_2 & f_3 & f_1 & f_0 & f_0 & f_0 & f_1 & f_2 & f_0 & f_0 \\
1 & 2 & 0 & -1 & -1 & -1 & 0 & 1 & -1 & -1
\end{array}$$

There, the last line represents the vertical displacements. The resulting paths are
known as Łukasiewicz paths. Such a walk is then characterized by two conditions:
the vertical displacements are in the set $\{-1, 0, 1, 2, \ldots\}$; all its points, except for the
very last one, lie in the upper half-plane.

By this correspondence, the number of Łukasiewicz paths with $n$ steps is the
shifted Catalan number, $\frac{1}{n} \binom{2n-2}{n-1}$.
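
The chain tree, degree sequence, lattice path can be sketched as follows, with a tree again represented (by assumption) as the list of its child subtrees. The reconstruction consumes the degree sequence in preorder, which amounts to adding each node at the leftmost available place. Function names are illustrative.

```python
def degree_sequence(tree):
    """Outdegrees of the nodes listed in preorder."""
    seq = [len(tree)]
    for child in tree:
        seq.extend(degree_sequence(child))
    return seq

def from_degree_sequence(seq):
    """Rebuild the tree from its preorder degree sequence."""
    it = iter(seq)
    def build():
        return [build() for _ in range(next(it))]
    return build()

def lukasiewicz_path(seq):
    """Successive altitudes of the walk with vertical steps j - 1."""
    heights, y = [0], 0
    for d in seq:
        y += d - 1
        heights.append(y)
    return heights

# The ten-node tree of the example: a(b(d(h), e, f), c(g(i, j))).
tree = [[[[]], [], []], [[[], []]]]
sigma = degree_sequence(tree)
print(sigma)
print(lukasiewicz_path(sigma))
```

The list of altitudes stays nonnegative until the final step, where it reaches $-1$, which is the characterization of Łukasiewicz paths given above.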

¹¹ A less dignified name is “Polish prefix notation”. The “reverse Polish notation” is a variant based
on postorder that has been used in some calculators since the 1970’s.

▷ I.44. Conjugacy principle and cycle lemma. Let $\mathcal{L}$ be the class of all Łukasiewicz paths.
Define a “relaxed” path as one that starts at level 0, ends at level $-1$ but is otherwise allowed
to reach arbitrary negative levels; let $\mathcal{M}$ be the corresponding class. Then, each relaxed path can be
cut-and-pasted uniquely after its leftmost minimum as described here:

[Diagram: a relaxed path is cut just after its leftmost minimum and its initial segment is appended at the end.]

This associates to every relaxed path of length $\nu$ a unique standard path. A bit of combinatorial
reasoning shows that the correspondence is 1-to-$\nu$ (each element of $\mathcal{L}$ has exactly $\nu$ preimages).
One thus has $M_\nu = \nu L_\nu$. This correspondence preserves the number of steps of each type
($f_0, f_1, \ldots$), so that the number of Łukasiewicz paths with $\nu_j$ steps of type $f_j$ is

$$\frac{1}{\nu} [x^{-1} u_0^{\nu_0} u_1^{\nu_1} \cdots] \left( x^{-1} u_0 + u_1 + x u_2 + x^2 u_3 + \cdots \right)^{\nu} = \frac{1}{\nu} \binom{\nu}{\nu_0, \nu_1, \ldots},$$

under the necessary condition $(-1)\nu_0 + 0\nu_1 + 1\nu_2 + 2\nu_3 + \cdots = -1$. This combinatorial way
of obtaining refined Catalan statistics is known as the conjugacy principle [407] or the cycle
lemma [98, 121, 145]. Raney has derived from it a purely combinatorial proof of the Lagrange
inversion formula [407] while Dvoretzky & Motzkin [145] have employed this technique to
solve a number of counting problems related to circular arrangements. ◁
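
The cut-and-paste operation is short enough to be tested exhaustively on small cases. In the sketch below a path is a tuple of vertical steps; `conjugate` cuts just after the leftmost minimum and moves the initial segment to the end. The restriction of steps to $\{-1, 0, 1, 2\}$ in the check is an arbitrary choice made to keep the enumeration finite.

```python
from collections import Counter
from itertools import accumulate, product

def conjugate(steps):
    """Cut a relaxed path just after its leftmost minimum and paste the prefix at the end."""
    sums = list(accumulate(steps))
    cut = sums.index(min(sums)) + 1      # index() returns the leftmost minimum
    return steps[cut:] + steps[:cut]

def is_lukasiewicz(steps):
    """All altitudes nonnegative except the last one, which equals -1."""
    sums = list(accumulate(steps))
    return sums[-1] == -1 and all(s >= 0 for s in sums[:-1])

def preimage_counts(nu, max_step=2):
    """For each standard path of length nu, count the relaxed paths mapped to it."""
    relaxed = [w for w in product(range(-1, max_step + 1), repeat=nu) if sum(w) == -1]
    return Counter(conjugate(w) for w in relaxed)

counts = preimage_counts(5)
print(all(is_lukasiewicz(w) for w in counts), set(counts.values()))
```

Every image is a Łukasiewicz path and every such path is hit the same number of times, namely once per cyclic shift.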


EXAMPLE I.15. Binary tree codes and Dyck paths. Walks associated with binary trees have
a very special form since the vertical displacements can only be $+1$ or $-1$. The resulting
paths of Łukasiewicz type are then equivalently characterized as sequences of numbers
$x = (x_0, x_1, \ldots, x_{2n}, x_{2n+1})$ satisfying the conditions

(62) $$x_0 = 0; \qquad x_j \ge 0 \ \text{ for } 1 \le j \le 2n; \qquad |x_{j+1} - x_j| = 1; \qquad x_{2n+1} = -1.$$

These coincide with “gambler ruin sequences”, a familiar object from probability theory: a
player plays heads and tails. He starts with no capital ($x_0 = 0$) at time 0; his total gain is $x_j$ at
time $j$; he is allowed no credit ($x_j \ge 0$) and loses at the very end of the game ($x_{2n+1} = -1$); his
gains are $\pm 1$ depending on the outcome of the coin tosses ($|x_{j+1} - x_j| = 1$).

It is customary to drop the final step and consider ‘excursions’ that take place in the upper
half-plane. The resulting objects defined as sequences $(x_0 = 0, x_1, \ldots, x_{2n} = 0)$ satisfying
the first three conditions of (62) are known in combinatorics as Dyck paths¹². By construction,
Dyck paths of length $2n$ correspond bijectively to binary trees with $n$ internal nodes and
are consequently enumerated by Catalan numbers. Let $\mathcal{D}$ be the combinatorial class of Dyck
paths, with size defined as length. This property can also be checked directly by means of the
quadratic decomposition

(63) $$\mathcal{D} = \{\epsilon\} + (\nearrow \mathcal{D} \searrow) \times \mathcal{D} \quad \Longrightarrow \quad D(z) = 1 + (z D(z) z)\, D(z).$$

From this OGF, the Catalan numbers are found (as expected): $D_{2n} = \frac{1}{n+1} \binom{2n}{n}$. The decomposition
(63) is known as the “first passage” decomposition as it is based on the first time the
cumulated gains in the coin-tossing game pass through the value zero.

Dyck paths also arise in connection with well-parenthesized expressions. These are recognized
by keeping a counter that records at each stage the excess of the number of opening
brackets ‘(’ over closing brackets ‘)’. Finally, one of the origins of Dyck paths is the famous
ballot problem, which goes back to the nineteenth century [346]: there are two candidates A
and B that stand for election, $2n$ voters, and the election eventually results in a tie; what is the
probability that A is always ahead of or tied with B when the ballots are counted? The answer
is

$$\frac{D_{2n}}{\binom{2n}{n}} = \frac{1}{n + 1},$$

since there are $\binom{2n}{n}$ possibilities in total, of which the number of favorable cases is $D_{2n}$, a
Catalan number. The central rôle of Dyck paths and Catalan numbers in problems coming from
such diverse areas is quite remarkable. Section V.3, p. 295 presents refined counting results
regarding lattice paths (e.g., the analysis of height) and Subsection VII.8.1, p. 482 introduces
exact and asymptotic results in the harder case of an arbitrary finite collection of step types (not
just $\pm 1$). END OF EXAMPLE I.15.

¹² Dyck paths are closely associated with free groups on one generator and are named after the German
mathematician Walther (von) Dyck (1856-1934) who introduced free groups around 1880.


▷ I.45. Dyck paths, parenthesis systems, and general trees. The class of Dyck paths admits an
alternative sequence decomposition

(64) $$\mathcal{D} = \mathrm{SEQ}(\mathcal{Z} \times \mathcal{D} \times \mathcal{Z}),$$

which again leads to the Catalan GF. The decomposition (64) is known as the “arch decomposition”
(see Subsection V.3.1, p. 296, for more). It can also be directly related to traversal
sequences of general trees, but with the directions of edge traversals being recorded (instead of
traversals based on node degrees): for a general tree $\tau$, define its encoding $\kappa(\tau)$ over the binary
alphabet $\{\nearrow, \searrow\}$ recursively by the rules:

$$\kappa(\bullet) = \epsilon, \qquad \kappa(\bullet(\tau_1, \ldots, \tau_r)) = \nearrow \kappa(\tau_1) \searrow \cdots \nearrow \kappa(\tau_r) \searrow.$$

This is the classical representation of trees by a parenthesis system (interpret ‘$\nearrow$’ and ‘$\searrow$’ as
‘(’ and ‘)’, respectively), which associates to a tree of $n$ nodes a path of length $2n - 2$. ◁


▷ I.46. Random generation of Dyck paths. Dyck paths of length $2n$ can be generated uniformly
at random in time linear in $n$. (Hint: By Note 44, it suffices to generate uniformly a sequence
of $n$ $a$’s and $n + 1$ $b$’s, then reorganize it according to the conjugacy principle.) ◁
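
A minimal sketch of the hint, taking the $a$’s as up-steps ($+1$) and the $b$’s as down-steps ($-1$): a uniformly shuffled word is rotated so that it starts just after its leftmost minimum, and the final down-step is dropped. Uniformity rests on the fact that each Dyck path arises from exactly $2n + 1$ of the shuffled words. The function name and the optional `rng` parameter are illustrative.

```python
import random
from itertools import accumulate

def random_dyck_path(n, rng=random):
    """Return a uniformly random Dyck path of length 2n as a list of +1/-1 steps."""
    steps = [1] * n + [-1] * (n + 1)
    rng.shuffle(steps)                       # uniform word with n ups, n + 1 downs
    sums = list(accumulate(steps))
    cut = sums.index(min(sums)) + 1          # just after the leftmost minimum
    path = steps[cut:] + steps[:cut]         # conjugate: the unique standard rotation
    return path[:-1]                         # drop the final step down to level -1

print(random_dyck_path(5, random.Random(2024)))
```

The shuffle, the prefix sums and the rotation each take time linear in $n$.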


▷ I.47. Excursions, bridges, and meanders. Adapting a terminology from probability theory,
one sets the following definitions: (i) a meander ($\mathcal{M}$) is a word over $\{-1, +1\}$, such that the
sum of the values of any of its prefixes is always a nonnegative integer; (ii) a bridge ($\mathcal{B}$) is a
word whose values of letters sum to 0. Thus a meander represents a walk that wanders in the
first quadrant; a bridge, regarded as a walk, may wander above and below the horizontal line,
but its final altitude is constrained to be 0; an excursion is both a meander and a bridge. Simple
decompositions provide

$$M(z) = \frac{D(z)}{1 - z D(z)}, \qquad B(z) = \frac{1}{1 - 2z^2 D(z)},$$

implying $M_n = \binom{n}{\lfloor n/2 \rfloor}$ [EIS A001405] and $B_{2n} = \binom{2n}{n}$ [EIS A000984]. ◁
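
The two counts can be confirmed for small lengths by exhaustive enumeration of the words over $\{-1, +1\}$. This is a brute-force check only; it does not use the generating functions, and the function names are illustrative.

```python
from itertools import accumulate, product
from math import comb

def count_meanders(n):
    """Words of length n over {-1, +1} whose prefix sums are all nonnegative."""
    return sum(all(s >= 0 for s in accumulate(w)) for w in product((-1, 1), repeat=n))

def count_bridges(n):
    """Words of length n over {-1, +1} whose letters sum to 0."""
    return sum(sum(w) == 0 for w in product((-1, 1), repeat=n))

print([count_meanders(n) for n in range(9)])
print([count_bridges(2 * n) for n in range(5)])
```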


▷ I.48. Motzkin paths and unary-binary trees. Motzkin paths are defined by changing the
third condition of (62) defining Dyck paths into $|x_{j+1} - x_j| \le 1$. They appear as codes for
unary-binary trees and are enumerated by the Motzkin numbers of Note 36. ◁


EXAMPLE I.16. The complexity of boolean functions. Complexity theory provides many
surprising applications of enumerative combinatorics and asymptotic estimates. In general, one
starts with a finite set of mathematical objects $\Omega$ and a combinatorial class $\mathcal{D}$ of descriptions.
By assumption, to every object $\delta \in \mathcal{D}$ is associated an element $\mu(\delta) \in \Omega$, its “meaning”;
conversely any object of $\Omega$ admits at least one description in $\mathcal{D}$, that is, the function $\mu$ is surjective.
It is then of interest to quantify properties of the shortest description function defined for
$\omega \in \Omega$ as

$$\sigma(\omega) := \min \left\{ |\delta|_{\mathcal{D}} \;\middle|\; \mu(\delta) = \omega \right\},$$

and called the complexity of element $\omega \in \Omega$ (with respect to $\mathcal{D}$).

We take here $\Omega$ to be the class of all boolean functions on $m$ variables. Their number is
$\|\Omega\| = 2^{2^m}$. As descriptions, we adopt the class of logical expressions involving the logical
connectives $\vee, \wedge$ and pure or negated variables. Equivalently, $\mathcal{D}$ is the class of binary trees,
where internal nodes are tagged by a logical disjunction (‘$\vee$’) or a conjunction (‘$\wedge$’), and each
external node is tagged by either a boolean variable of $\{x_1, \ldots, x_m\}$ or a negated variable of
$\{\neg x_1, \ldots, \neg x_m\}$. Define the size of a tree description as the number of internal nodes, that is,
the number of logical operators. Then, one has

(65) $$D_n = \left( \frac{1}{n+1} \binom{2n}{n} \right) \cdot 2^n \cdot (2m)^{n+1},$$

as seen by counting tree shapes and possibilities for internal as well as external node tags. 
The crux of the matter is that if the inequality

(66) $$\sum_{j=0}^{\nu} D_j < \|\Omega\|$$

holds, then there are not enough descriptions of size $\le \nu$ to exhaust $\Omega$. In other terms, there
must exist at least one object in $\Omega$ whose complexity exceeds $\nu$. If the left side of (66) is
much smaller than the right side, then it must even be the case that “most” $\Omega$-objects have a
complexity that exceeds $\nu$.

In the case of boolean functions and tree descriptions, the asymptotic form (24) is available.
There results from (65) that, for $n, \nu$ getting large, one has

$$D_n = O(16^n m^n n^{-3/2}), \qquad \sum_{j=0}^{\nu} D_j = O(16^\nu m^\nu \nu^{-3/2}).$$

Choose $\nu$ such that the second expression is $o(\|\Omega\|)$. This is ensured for instance by taking for
$\nu$ the value

$$\nu(m) := \frac{2^m}{4 + \log_2 m},$$

as verified by a simple asymptotic calculation. With this choice, one has the following suggestive
statement:

A fraction tending to 1 (as $m \to \infty$) of boolean functions in $m$ variables have tree
complexity at least $2^m / \log_2 m$.

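Regarding upper bounds on boolean function complexity, a function always has a tree
complexity that is at most $2^{m+1} - 3$. To see it, note that for $m = 1$, the 4 functions are

$$0 \equiv (x_1 \wedge \neg x_1), \qquad 1 \equiv (x_1 \vee \neg x_1), \qquad x_1, \qquad \neg x_1.$$

Next, a function of $m$ variables is representable by a technique known as the binary decision
tree (BDT),

$$f(x_1, \ldots, x_{m-1}, x_m) = \left( \neg x_m \wedge f(x_1, \ldots, x_{m-1}, 0) \right) \vee \left( x_m \wedge f(x_1, \ldots, x_{m-1}, 1) \right),$$

which provides the basis of the induction as it reduces the representation of an $m$-ary function
to the representation of two $(m - 1)$-ary functions, consuming on the way three logical
connectives. Writing $c_m$ for the largest tree complexity of a function of $m$ variables, one thus has
$c_1 \le 1$ and $c_m \le 2c_{m-1} + 3$, whence $c_m \le 2^{m+1} - 3$.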
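
For small $m$ the counting argument can be evaluated exactly rather than asymptotically. The sketch below computes $D_n$ from (65), finds the largest $\nu$ for which inequality (66) holds (so that some function has tree complexity exceeding $\nu$), and compares it with the BDT upper bound $2^{m+1} - 3$. Function names are illustrative, and the exact small-$m$ values are of course much cruder than the asymptotic statement.

```python
from math import comb

def descriptions(n, m):
    """Formula (65): tree descriptions with n connectives over m variables."""
    return comb(2 * n, n) // (n + 1) * 2 ** n * (2 * m) ** (n + 1)

def counting_lower_bound(m):
    """Largest nu with D_0 + ... + D_nu < 2^(2^m); returns -1 if there is none."""
    target = 2 ** (2 ** m)
    total, nu = 0, -1
    while True:
        total += descriptions(nu + 1, m)
        if total >= target:
            return nu
        nu += 1

for m in range(1, 7):
    print(m, counting_lower_bound(m), 2 ** (m + 1) - 3)
```

Each printed line gives $m$, a value $\nu$ that the complexity of some function must exceed, and the upper bound valid for all functions.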

Altogether, basic counting arguments have shown that “most” boolean functions have
a tree-complexity that is “close” to the maximum possible, namely, $O(2^m)$. A similar result
has been established by Shannon for the measure called circuit complexity: circuits are
more powerful than trees, but Shannon’s result states that almost all boolean functions of $m$
variables have circuit complexity $O(2^m / m)$. See [481], especially the chapter by Li and
Vitányi, for a discussion of such counting techniques within the framework of complexity theory.
END OF EXAMPLE I.16.

I.5.4. Context-free specifications and languages. Many of the combinatorial
examples encountered so far in this section can be organized into a common framework,
which is fundamental in formal linguistics and theoretical computer science.

Definition I.13. A class $\mathcal{C}$ is said to be context-free if it coincides with the first component
($\mathcal{C} = \mathcal{S}_1$) of a system of equations

(67) $$\begin{cases}
\mathcal{S}_1 = \Phi_1(\mathcal{Z}, \mathcal{S}_1, \ldots, \mathcal{S}_r) \\
\quad \vdots \\
\mathcal{S}_r = \Phi_r(\mathcal{Z}, \mathcal{S}_1, \ldots, \mathcal{S}_r),
\end{cases}$$

where each $\Phi_j$ is a constructor that only involves the operations of combinatorial sum
($+$) and cartesian product ($\times$), as well as the neutral class, $\mathcal{E} = \{\epsilon\}$.

A language $\mathcal{L}$ is said to be an unambiguous context-free language if it is combinatorially
isomorphic to a context-free variety of trees $\mathcal{T}$: $\mathcal{L} \cong \mathcal{T}$.


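The classes of general trees ($\mathcal{G}$) and binary trees ($\mathcal{B}$) are context-free, since they
are specifiable as

$$\begin{cases}
\mathcal{G} = \mathcal{Z} \times \mathcal{F} \\
\mathcal{F} = \{\epsilon\} + (\mathcal{G} \times \mathcal{F})
\end{cases}
\qquad \mathcal{B} = \mathcal{Z} + (\mathcal{B} \times \mathcal{B}).$$

($\mathcal{F}$ designates ordered forests of general trees.) Context-free specifications may be
used to describe all sorts of combinatorial objects. For instance, the class $\mathcal{T}$ of triangulations
of convex polygons is specified symbolically by

(68) $$\mathcal{T} = \nabla + (\nabla \times \mathcal{T}) + (\mathcal{T} \times \nabla) + (\mathcal{T} \times \nabla \times \mathcal{T}),$$

where $\nabla$ represents a generic triangle. The Łukasiewicz language and the set of Dyck
paths are context-free classes since they are bijectively equivalent to $\mathcal{G}$ and $\mathcal{T}$.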
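
By the sum and product rules, specification (68) translates into $T(z) = z + 2zT(z) + zT(z)^2$ when size is the number of triangles, that is, $T_n = 2T_{n-1} + \sum_{i=1}^{n-2} T_i T_{n-1-i}$ for $n \ge 2$ with $T_1 = 1$. The sketch below (function name illustrative) iterates this recurrence and compares the result with the Catalan numbers.

```python
from math import comb

def triangulations(n_max):
    """T_1..T_{n_max}, where T_n counts triangulations made of n triangles."""
    t = [0, 1]                           # t[n] = T_n, with T_1 = 1
    for n in range(2, n_max + 1):
        t.append(2 * t[n - 1] + sum(t[i] * t[n - 1 - i] for i in range(1, n - 1)))
    return t[1:]

print(triangulations(8))
print([comb(2 * n, n) // (n + 1) for n in range(1, 9)])
```

The term $2T_{n-1}$ comes from the two summands of (68) in which the root triangle has a single subtriangulation, and the convolution from the summand in which it has two.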

The term “context-free” comes from linguistics: it stresses the fact that objects
can be “freely” generated by the rules of (67), this without any constraints imposed
by an outside context¹³. There, one classically defines a context-free language as the
language formed with words that are obtained as sequences of leaf tags (read in left-to-right
order) of a context-free variety of trees. In formal linguistics, the one-to-one
mapping between trees and words is not generally imposed; when it is satisfied, the
context-free language is said to be unambiguous; then, words and trees determine each
other uniquely, cf. Note 50 below.

An immediate consequence of admissibility theorems is the following proposition
first encountered by Chomsky and Schützenberger [89] in the course of their research
relating formal languages and formal power series:

Proposition I.7. A combinatorial class $\mathcal{C}$ that is context-free admits an OGF that is
an algebraic function. In other words, there exists a (non-null) bivariate polynomial
$P(z, y) \in \mathbb{C}[z, y]$ such that

$$P(z, C(z)) = 0.$$

¹³ Formal language theory also defines context-sensitive grammars where each rule (called a production)
is applied only if it is enabled by some external context. Context-sensitive grammars have greater
expressive power than context-free ones, but they depart significantly from decomposability and are surrounded
by strong undecidability properties. Accordingly context-sensitive grammars cannot be associated
to any global generating function formalism.
PROOF. By the basic sum and product rules, the context-free system (67) translates
into a system of OGF equations,

$$\begin{cases}
S_1(z) = \Psi_1(z, S_1(z), \ldots, S_r(z)) \\
\quad \vdots \\
S_r(z) = \Psi_r(z, S_1(z), \ldots, S_r(z)),
\end{cases}$$

where the $\Psi_j$ are the polynomials translating the constructions $\Phi_j$.

It is then well-known that algebraic elimination is possible in polynomial systems.
Here, it is possible to eliminate the auxiliary variables $S_2, \ldots, S_r$, one by one,
preserving the polynomial character of the system at each stage. The end result is
then a single polynomial equation satisfied by $C(z) = S_1(z)$. (Methods for effectively
performing polynomial elimination include a repeated use of resultants as well
as Groebner basis algorithms; see APPENDIX B: Algebraic elimination, p. 685 for a
brief discussion and references.)
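
As a small worked instance of the elimination step, consider the two-equation system for general trees, $\mathcal{G} = \mathcal{Z} \times \mathcal{F}$, $\mathcal{F} = \{\epsilon\} + (\mathcal{G} \times \mathcal{F})$. The derivation below eliminates the auxiliary series $F$ by hand; it is only meant to illustrate the mechanism on the simplest case.

```latex
\begin{aligned}
& G = zF, \qquad F = 1 + GF
  && \text{(OGF system given by the sum and product rules)} \\
& zF = z + G \cdot zF
  && \text{(second equation multiplied by } z\text{)} \\
& G = z + G^2
  && \text{(substituting } zF = G\text{)} \\
& P(z, y) = y^2 - y + z, \qquad P(z, G(z)) = 0 .
\end{aligned}
```

The resulting polynomial equation has the solution $G(z) = \frac{1}{2}\left(1 - \sqrt{1 - 4z}\right)$, the branch with $G(0) = 0$.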

Proposition I.7 is a counterpart of Proposition I.3 (p. 54) according to which rational
generating functions arise from finite state devices, and it explains the importance of
algebraic functions in enumerative theory. We shall develop a general asymptotic theory
of coefficients of algebraic functions in Chapter VII, based on singularity theory.


▷ I.49. “Tree-like” structures. A context-free specification can always be regarded as defining a
class of trees. Indeed, if the $j$th term in the construction $\Phi_i$ is “coloured” with the pair $(i, j)$, it is
seen that a context-free system yields a class of trees whose nodes are tagged by pairs $(i, j)$ in a
way consistent with the system’s rules (1.13). However, despite this correspondence, it is often
convenient to preserve the possibility of operating directly with objects when the tree aspect
is unnatural. (Some authors have developed a parallel notion of “object grammars”; see for
instance [144] itself inspired by techniques of polyomino surgery in [116].) By a terminology
borrowed from the theory of syntax analysis in computer science, such trees are referred to as
“parse trees” or “syntax trees”. ◁


▷ I.50. Context-free languages. Let $\mathcal{A}$ be a fixed finite alphabet whose elements are called
letters. A grammar $G$ is a collection of equations

$$G:\ \begin{cases} \mathcal{L}_1 = \Psi_1(\mathbf{a}, \mathcal{L}_1, \ldots, \mathcal{L}_m) \\ \qquad\vdots \\ \mathcal{L}_m = \Psi_m(\mathbf{a}, \mathcal{L}_1, \ldots, \mathcal{L}_m) \end{cases} \tag{69}$$

where each $\Psi_j$ involves only the operations of union ($\cup$) and catenation product ($\cdot$) with $\mathbf{a}$ the
vector of letters in $\mathcal{A}$. For instance,

$$\Psi_1(\mathbf{a}, \mathcal{L}_1, \mathcal{L}_2, \mathcal{L}_3) = a_2 \cdot \mathcal{L}_2 \cdot \mathcal{L}_3 \ \cup\ a_3 \ \cup\ \mathcal{L}_3 \cdot a_2 \cdot \mathcal{L}_1.$$

A solution to (69) is an $m$-tuple of languages over the alphabet $\mathcal{A}$ that satisfies the system. By
convention, one declares that the grammar $G$ defines the first component, $\mathcal{L}_1$.

To each grammar (69), one can associate a context-free specification (51) by transforming
unions into disjoint union, ‘$\cup$’ $\mapsto$ ‘$+$’, and catenation into cartesian products, ‘$\cdot$’ $\mapsto$ ‘$\times$’. Let
$\widehat{G}$ be the specification associated in this way to the grammar $G$. The objects described by $\widehat{G}$
appear in this perspective to be trees (see the discussion above regarding parse trees). Let $h$
be the transformation from trees of $\widehat{G}$ to languages of $G$ that lists letters in infix (i.e., left-to-right)
order: we call such an $h$ the erasing transformation since it “forgets” all the structural
information contained in the parse tree and only preserves the succession of letters. Clearly,
application of $h$ to the combinatorial specifications determined by $\widehat{G}$ yields languages that obey
the grammar $G$. For a grammar $G$ and a word $w \in \mathcal{A}^*$, the number of parse trees $t \in \widehat{G}$
such that $h(t) = w$ is called the ambiguity coefficient of $w$ with respect to the grammar $G$; this
quantity is denoted by $\kappa_G(w)$.

A grammar $G$ is unambiguous if all the corresponding ambiguity coefficients are either 0
or 1. This means that there is a bijection between parse trees of $\widehat{G}$ and words of the language
described by $G$: each word generated is uniquely “parsable” according to the grammar. In such
a case, the OGFs of languages satisfy a polynomial system of the form (52). ◁
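
As an illustration of the ambiguity coefficient $\kappa_G(w)$, the following sketch counts parse trees by memoized recursion over factors of the word. It is a minimal sketch under assumptions: the dictionary encoding of grammars, the function name `ambiguity`, and the two sample grammars are hypothetical, and productions that are empty or that rewrite one non-terminal into another single non-terminal are assumed absent so that the recursion terminates.

```python
from functools import lru_cache


def ambiguity(grammar, start, word):
    """Number of parse trees t with h(t) = word, i.e. the coefficient kappa_G(word).

    `grammar` maps each non-terminal to a list of productions; a production is a
    non-empty tuple of symbols. Symbols that are not keys of `grammar` are letters.
    """

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # number of parse trees rooted at `symbol` that spell word[i:j]
        if symbol not in grammar:
            return 1 if j == i + 1 and word[i] == symbol else 0
        return sum(split(prod, i, j) for prod in grammar[symbol])

    @lru_cache(maxsize=None)
    def split(prod, i, j):
        # ways to cut word[i:j] into len(prod) non-empty consecutive factors
        if len(prod) == 1:
            return trees(prod[0], i, j)
        total = 0
        for k in range(i + 1, j - len(prod) + 2):
            left = trees(prod[0], i, k)
            if left:
                total += left * split(prod[1:], k, j)
        return total

    return trees(start, 0, len(word))


# Two hypothetical grammars for the same language {a, aa, aaa, ...}:
AMBIGUOUS = {"L": [("a",), ("L", "L")]}      # L = a  U  L.L
UNAMBIGUOUS = {"L": [("a",), ("a", "L")]}    # L = a  U  a.L
```

With the first grammar the word $a^n$ has as many parse trees as there are binary trees with $n$ leaves, whereas the second grammar gives every word exactly one parse tree, so only the second yields the OGF of the language directly.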


▷ I.51. Extended context-free specifications. If $\mathcal{A}$, $\mathcal{B}$ are context-free specifications then: (i) the
sequence class $\mathcal{C} = \mathrm{SEQ}(\mathcal{A})$ is context-free; (ii) the substitution class $\mathcal{D} = \mathcal{A}[b \mapsto \mathcal{B}]$ is also
context-free. ◁


I.6. Additional constructions 


This section is devoted to the constructions of sequences, sets, and cycles in the 
presence of restrictions on the number of components as well as to mechanisms that 
enrich the framework of core constructions, namely, pointing, substitution, and the 
use of implicit combinatorial definitions. 


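I.6.1. Restricted constructions. An immediate formula for OGFs is that of the
diagonal $\Delta$ of a cartesian product $\mathcal{B} \times \mathcal{B}$ defined as

$$\mathcal{A} \equiv \Delta(\mathcal{B} \times \mathcal{B}) := \{(\beta, \beta) \mid \beta \in \mathcal{B}\}.$$

Then, clearly $A_{2n} = B_n$, so that

$$A(z) = B(z^2).$$

The diagonal construction permits us to access the class of all unordered pairs of
(distinct) elements of $\mathcal{B}$, which is $\mathcal{A} = \mathrm{PSET}_2(\mathcal{B})$. A direct argument then runs as
follows: the unordered pair $\{\alpha, \beta\}$ is associated to the two ordered pairs $(\alpha, \beta)$ and
$(\beta, \alpha)$ except when $\alpha = \beta$, where an element of the diagonal is obtained. In other
words, one has the combinatorial isomorphism,

$$\mathrm{PSET}_2(\mathcal{B}) + \mathrm{PSET}_2(\mathcal{B}) + \Delta(\mathcal{B} \times \mathcal{B}) \cong \mathcal{B} \times \mathcal{B},$$

meaning that

$$2A(z) + B(z^2) = B(z)^2.$$

The resulting translation into OGFs is thus

$$\mathcal{A} = \mathrm{PSET}_2(\mathcal{B}) \implies A(z) = \frac{1}{2} B(z)^2 - \frac{1}{2} B(z^2).$$

Similarly, for multisets, we find

$$\mathcal{A} = \mathrm{MSET}_2(\mathcal{B}) \implies A(z) = \frac{1}{2} B(z)^2 + \frac{1}{2} B(z^2),$$

while for cycles one has $\mathrm{CYC}_2 \cong \mathrm{MSET}_2$, and

$$\mathcal{A} = \mathrm{CYC}_2(\mathcal{B}) \implies A(z) = \frac{1}{2} B(z)^2 + \frac{1}{2} B(z^2).$$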
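
The two-component formulae can be checked by exhaustive enumeration on a small class. The sketch below is an illustration only: the helper names and the sample counting sequence are assumptions made here. It compares $\frac12 B(z)^2 \mp \frac12 B(z^2)$ with a direct listing of unordered pairs, where objects of $\mathcal{B}$ are modelled as (size, index) pairs.

```python
from itertools import combinations, combinations_with_replacement


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def two_component_ogfs(b):
    """Coefficient lists of PSET_2(B) and MSET_2(B) from the closed forms."""
    square = poly_mul(b, b)              # B(z)^2
    diagonal = [0] * len(square)         # B(z^2)
    for n, bn in enumerate(b):
        diagonal[2 * n] = bn
    pset2 = [(s - d) // 2 for s, d in zip(square, diagonal)]
    mset2 = [(s + d) // 2 for s, d in zip(square, diagonal)]
    return pset2, mset2


def two_component_brute_force(b):
    """Same counts, obtained by listing all pairs of objects (size, index)."""
    objects = [(n, i) for n, bn in enumerate(b) for i in range(bn)]
    length = 2 * len(b) - 1
    pset2, mset2 = [0] * length, [0] * length
    for x, y in combinations(objects, 2):                    # distinct elements
        pset2[x[0] + y[0]] += 1
    for x, y in combinations_with_replacement(objects, 2):   # repetition allowed
        mset2[x[0] + y[0]] += 1
    return pset2, mset2
```

For the sample class with $B(z) = 2z + z^2 + 3z^3$, both routes give $z^2 + 2z^3 + 6z^4 + 3z^5 + 3z^6$ for sets and $3z^2 + 2z^3 + 7z^4 + 3z^5 + 6z^6$ for multisets.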


This type of direct reasoning could be extended to treat triples, and so on, but the 
computations (if not the reasoning) tend to grow out of control. An approach based 
on multivariate generating functions generates simultaneously all cardinality restricted 
constructions. 


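Theorem I.3 (Component-restricted constructions). The OGF of sequences with $k$
components $\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B})$ satisfies

$$A(z) = B(z)^k.$$

The OGF of sets, $\mathcal{A} = \mathrm{PSET}_k(\mathcal{B})$, is a polynomial in the quantities $B(z), \ldots, B(z^k)$,

$$A(z) = [u^k] \exp\left( \frac{u}{1} B(z) - \frac{u^2}{2} B(z^2) + \frac{u^3}{3} B(z^3) - \cdots \right).$$

The OGF of multisets, $\mathcal{A} = \mathrm{MSET}_k(\mathcal{B})$, is

$$A(z) = [u^k] \exp\left( \frac{u}{1} B(z) + \frac{u^2}{2} B(z^2) + \frac{u^3}{3} B(z^3) + \cdots \right).$$

The OGF of cycles, $\mathcal{A} = \mathrm{CYC}_k(\mathcal{B})$, is

$$A(z) = [u^k] \sum_{\ell \ge 1} \frac{\varphi(\ell)}{\ell} \log \frac{1}{1 - u^\ell B(z^\ell)},$$

where $\varphi$ is the Euler totient function.

The explicit forms for small values of $k$ are summarized in Figure 14.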
PROOF. The result for sequences is obvious since $\mathrm{SEQ}_k(\mathcal{B})$ means $\mathcal{B} \times \cdots \times \mathcal{B}$ ($k$
times). For the other constructions, the proof makes use of the techniques of Theorem I.1,
but it is best based on bivariate generating functions that are otherwise developed
fully in Chapter III to which we refer for details. The idea consists in describing
all composite objects and introducing a supplementary marking variable to keep track
of the number of components.

Take $\mathfrak{K}$ to be a construction amongst SEQ, CYC, MSET, PSET, set $\mathcal{A} = \mathfrak{K}(\mathcal{B})$,
and let $\chi(\alpha)$ for $\alpha \in \mathcal{A}$ be the parameter “number of $\mathcal{B}$-components”. Define the
multivariate quantities

$$A_{n,k} := \operatorname{card}\{\alpha \in \mathcal{A} \mid |\alpha| = n,\ \chi(\alpha) = k\},$$

$$A(z, u) := \sum_{n,k} A_{n,k}\, u^k z^n = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{\chi(\alpha)}.$$

For instance, a direct calculation shows that, for sequences, there holds

$$A(z, u) = \sum_{k \ge 0} u^k B(z)^k = \frac{1}{1 - uB(z)}.$$

For multisets and powersets, a simple adaptation of the already seen argument gives
$A(z, u)$ as

$$A(z, u) = \prod_n (1 - uz^n)^{-B_n}, \qquad A(z, u) = \prod_n (1 + uz^n)^{B_n},$$

respectively. The result follows from there by the exp-log transformation upon extracting
$[u^k] A(z, u)$. The case of cycles results from the bivariate generating function
for cycles derived in APPENDIX A: Cycle construction, p. 674.
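
The proof above can be followed numerically for $k = 3$. In the sketch below (an illustration under assumptions; all function names and the sample sequence are hypothetical), the bivariate products $\prod_n (1 + uz^n)^{B_n}$ and $\prod_n (1 - uz^n)^{-B_n}$ are expanded one factor at a time, and the coefficient of $u^3$ is compared with the closed forms $\frac16 B(z)^3 \mp \frac12 B(z)B(z^2) + \frac13 B(z^3)$ that the exp-log transformation produces.

```python
from fractions import Fraction


def restricted_counts(b, k_max, n_max, multiset):
    """coef[k][n] = number of sets (or multisets) of B with k components and size n."""
    coef = [[0] * (n_max + 1) for _ in range(k_max + 1)]
    coef[0][0] = 1
    for m in range(1, min(len(b), n_max + 1)):
        for _ in range(b[m]):                  # one factor per object of size m
            if multiset:
                # multiply by 1/(1 - u z^m): ascending k reads the updated row k-1
                for k in range(1, k_max + 1):
                    for n in range(m, n_max + 1):
                        coef[k][n] += coef[k - 1][n - m]
            else:
                # multiply by (1 + u z^m): descending k reads the old row k-1
                for k in range(k_max, 0, -1):
                    for n in range(n_max, m - 1, -1):
                        coef[k][n] += coef[k - 1][n - m]
    return coef


def mul(a, b, n_max):
    """Product of two coefficient lists, truncated after z^n_max."""
    out = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0 or i > n_max:
            continue
        for j, bj in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += ai * bj
    return out


def dilate(b, j, n_max):
    """Coefficients of B(z^j), truncated after z^n_max."""
    out = [0] * (n_max + 1)
    for n, bn in enumerate(b):
        if j * n <= n_max:
            out[j * n] = bn
    return out


def three_components(b, n_max, multiset):
    """Closed form B^3/6 +/- B(z)B(z^2)/2 + B(z^3)/3 (plus sign for multisets)."""
    sign = 1 if multiset else -1
    b1, b2, b3 = (dilate(b, j, n_max) for j in (1, 2, 3))
    cube = mul(mul(b1, b1, n_max), b1, n_max)
    mixed = mul(b1, b2, n_max)
    return [Fraction(c, 6) + sign * Fraction(x, 2) + Fraction(t, 3)
            for c, x, t in zip(cube, mixed, b3)]
```

Working with `Fraction` keeps the divisions by 6, 2 and 3 exact, so the comparison with the integer counts from the product expansion is an exact equality rather than a floating-point approximation.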


▷ I.52. Sets with distinct component sizes. Let $\mathcal{A}$ be the class of the finite sets of elements from
$\mathcal{B}$, with the additional constraint that no two elements in a set have the same size. One has

$$A(z) = \prod_{n \ge 1} (1 + B_n z^n).$$

Similar identities serve in the analysis of polynomial factorization algorithms [186]. ◁


▷ I.53. Sequences without repeated components. The generating function is formally:

$$\int_0^\infty \exp\left( \sum_{j \ge 1} (-1)^{j-1} \frac{u^j}{j} A(z^j) \right) e^{-u}\, du.$$

(This form is based on the Eulerian integral: $k! = \int_0^\infty e^{-u} u^k\, du$.) ◁


I.6.2. Pointing and substitution. Two more constructions, namely pointing and
substitution, translate agreeably into generating functions. Combinatorial structures
are viewed here as formed of “atoms” (words are composed of letters, graphs of nodes,
etc.) which determine their sizes. In this context, pointing means “pointing at a
distinguished atom”; substitution, written $\mathcal{B} \circ \mathcal{C}$ or $\mathcal{B}[\mathcal{C}]$, means “substitute elements of $\mathcal{C}$
for atoms of $\mathcal{B}$”.


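Definition I.14. Let $\{\epsilon_1, \epsilon_2, \ldots\}$ be a fixed collection of distinct neutral objects of
size 0. The pointing of a class $\mathcal{B}$, noted $\mathcal{A} = \Theta\mathcal{B}$, is formally defined by

$$\Theta\mathcal{B} := \sum_{n \ge 0} \mathcal{B}_n \times \{\epsilon_1, \ldots, \epsilon_n\}.$$

The substitution of $\mathcal{C}$ into $\mathcal{B}$ (also known as composition of $\mathcal{B}$ and $\mathcal{C}$), noted $\mathcal{B} \circ \mathcal{C}$
or $\mathcal{B}[\mathcal{C}]$, is formally defined as

$$\mathcal{B} \circ \mathcal{C} \equiv \mathcal{B}[\mathcal{C}] := \sum_{k \ge 0} \mathcal{B}_k \times \mathrm{SEQ}_k(\mathcal{C}).$$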

If $B_n$ is the number of $\mathcal{B}$ structures of size $n$, then $nB_n$ can be interpreted as
counting pointed structures where one of the $n$ atoms composing a $\mathcal{B}$-structure has
been distinguished (here by a special “pointer” of size 0 attached to it). Elements of
$\mathcal{B} \circ \mathcal{C}$ may also be viewed as obtained by selecting in all possible ways an element
$\beta \in \mathcal{B}$ and replacing each of its atoms by an arbitrary element of $\mathcal{C}$.

The interpretations above rely (silently) on the fact that atoms in an object can
be eventually distinguished from each other. This can be obtained by “canonicalizing”
the representations of objects: first define inductively the lexicographic ordering
for products and sequences; next represent powersets and multisets as increasing
sequences with the induced lexicographic ordering (more complicated rules can also
canonicalize cycles). In this way, any constructible object admits a unique “rigid”
representation in which each particular atom is determined by its place. Such a
canonicalization thus reconciles the abstract definition, Definition I.14, and the intuitive
interpretation of pointing and substitution.


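Theorem I.4 (Pointing and substitution). The constructions of pointing and substitution
are admissible¹⁵:

$$\mathcal{A} = \Theta\mathcal{B} \implies A(z) = z\,\partial_z B(z), \qquad \partial_z := \frac{d}{dz},$$

$$\mathcal{A} = \mathcal{B} \circ \mathcal{C} \implies A(z) = B(C(z)).$$

PROOF. By the definition of pointing, one has

$$A_n = n \cdot B_n \qquad\text{and}\qquad A(z) = z \frac{d}{dz} B(z).$$

From the definition of substitution, $\mathcal{A} = \mathcal{B}[\mathcal{C}]$ implies, by the sum and product rules,

$$A(z) = \sum_{k \ge 0} B_k \cdot (C(z))^k = B(C(z)),$$

and the proof is completed.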

Permutations as pointed objects. As an example of pointing, consider the class $\mathcal{P}$
of all permutations written as words over integers starting from 1. One can go from a
permutation of size $n - 1$ to a permutation of size $n$ by selecting a “gap” and inserting
the value $n$. When this is done in all possible ways, it gives rise to the combinatorial
relation

$$\mathcal{P} = \mathcal{E} + \Theta(\mathcal{Z} \times \mathcal{P}), \quad \mathcal{E} = \{\epsilon\}, \implies P(z) = 1 + z \frac{d}{dz}\bigl(zP(z)\bigr).$$

This means that the OGF satisfies an ordinary differential equation whose formal
solution is $P(z) = \sum_{n \ge 0} n!\, z^n$.
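
On coefficient lists, pointing is simply multiplication of the $n$th coefficient by $n$, so the relation for permutations can be solved term by term. The following is a minimal sketch; the function names `point`, `shift` and `permutations_by_pointing` are hypothetical choices made for this illustration.

```python
from math import factorial


def point(series):
    """Pointing on OGF coefficients: B_n becomes n * B_n (the operator z d/dz)."""
    return [n * c for n, c in enumerate(series)]


def shift(series):
    """Multiplication by z, keeping the same truncation length."""
    return [0] + series[:-1]


def permutations_by_pointing(n_terms):
    """Iterate P = 1 + Theta(z * P) until the first n_terms coefficients settle."""
    p = [0] * n_terms
    for _ in range(n_terms):          # each pass fixes one more coefficient
        rhs = point(shift(p))
        rhs[0] += 1
        p = rhs
    return p
```

The iteration reproduces the recurrence $P_n = n P_{n-1}$ with $P_0 = 1$, hence the factorials.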

¹⁴ Such canonicalization techniques also serve to develop fast algorithms for the exhaustive listing
of objects of a given size as well as for the range of problems known as “ranking” and “unranking”, with
implications in fast random generation. See, e.g., [352, 373, 495] for the general theory as well as [405, 510]
for particular cases like necklaces and trees.

¹⁵ In this book, we borrow from differential algebra the convenient notation $\partial_z := \frac{d}{dz}$ to represent
derivatives.

Unary-binary trees as substituted objects. As an example of substitution, consider
the class $\mathcal{B}$ of (plane rooted) binary trees, where all nodes contribute to size. If
at each node there is substituted a linear chain of nodes (linked by edges placed on top
of the node), one forms an element of the class $\mathcal{M}$ of unary-binary trees; in symbols:

$$\mathcal{M} = \mathcal{B} \circ \mathrm{SEQ}_{\ge 1}(\mathcal{Z}) \implies M(z) = B\left( \frac{z}{1 - z} \right).$$

Thus from the known OGF, $B(z) = (1 - \sqrt{1 - 4z^2})/(2z)$, one derives

$$M(z) = \frac{1 - \sqrt{1 - 4z^2 (1 - z)^{-2}}}{2z (1 - z)^{-1}} = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z},$$

which matches the direct derivation on p. 63 (Motzkin numbers).
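
The substitution $M(z) = B(z/(1-z))$ can be verified on truncated series without using the closed forms. The sketch below is illustrative only: the helper names are assumptions, and the direct equation $M = z(1 + M + M^2)$ for unary-binary trees is used as an independent check.

```python
def mul(a, b, n):
    """Product of two series, keeping the coefficients of z^0 .. z^(n-1)."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[: n - i]):
                out[i + j] += ai * bj
    return out


def compose(outer, inner, n):
    """Coefficients of outer(inner(z)) modulo z^n; requires inner[0] == 0."""
    result = [0] * n
    power = [1] + [0] * (n - 1)              # inner(z)^0
    for c in outer[:n]:
        result = [r + c * p for r, p in zip(result, power)]
        power = mul(power, inner, n)
    return result


def binary_trees(n):
    """Solve B = z + z*B^2 (every node counted) modulo z^n, for n >= 2."""
    b = [0] * n
    for _ in range(n):
        b = [0] + mul(b, b, n)[:-1]          # z * B(z)^2
        b[1] += 1                            # + z
    return b


def unary_binary_trees(n):
    """Solve M = z*(1 + M + M^2) directly modulo z^n, for n >= 2."""
    m = [0] * n
    for _ in range(n):
        inner = [x + y for x, y in zip(m, mul(m, m, n))]
        inner[0] += 1
        m = [0] + inner[:-1]
    return m
```

A usage example: `compose(binary_trees(10), [0] + [1] * 9, 10)` substitutes the series of $z/(1-z)$ into $B$ and agrees with `unary_binary_trees(10)`.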


▷ I.54. Combinatorics of derivatives. The combinatorial operation $D$ of “eraser-pointing”
points to an atom in an object and replaces it by a neutral object, otherwise preserving the
overall structure of the object. The translation of $D$ on OGFs is then simply $\partial \equiv \partial_z$. Classical
identities of analysis then receive simple combinatorial interpretations, for instance,

$$\partial(\mathcal{A} \times \mathcal{B}) = (\mathcal{A} \times \partial\mathcal{B}) + (\partial\mathcal{A} \times \mathcal{B});$$

Leibniz’s identity, $\partial^m (f \cdot g) = \sum_j \binom{m}{j} (\partial^j f) \cdot (\partial^{m-j} g)$, also follows from basic combinatorics.
Similarly, for the “chain rule” $\partial(f \circ g) = ((\partial f) \circ g) \cdot \partial g$. ◁


I.6.3. Implicit structures. There are many cases where a combinatorial class $\mathcal{X}$
is determined by a relation $\mathcal{A} = \mathcal{B} + \mathcal{X}$, where $\mathcal{A}$ and $\mathcal{B}$ are known. In terms of
generating functions, one has $A(z) = B(z) + X(z)$, so that

$$\mathcal{A} = \mathcal{B} + \mathcal{X} \implies X(z) = A(z) - B(z).$$

For instance, the autocorrelation technique of Section I.4.2 makes it possible to describe
the class $\mathcal{S}$ of all words in $\mathcal{W}$ that do not contain a given pattern $p$, whereas
the language of words containing the pattern is determined as the solution in $\mathcal{X}$ of the
equation $\mathcal{W} = \mathcal{S} + \mathcal{X}$; see p. 56. Similarly, for products, basic algebra gives

$$\mathcal{A} = \mathcal{B} \times \mathcal{X} \implies X(z) = \frac{A(z)}{B(z)}.$$

Here are the corresponding solutions for two of the composite constructions. 


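Theorem I.5 (Implicit specifications). The generating functions associated to the implicit
equations in $\mathcal{X}$

$$\mathcal{A} = \mathrm{SEQ}(\mathcal{X}), \qquad \mathcal{A} = \mathrm{MSET}(\mathcal{X})$$

are respectively

$$X(z) = 1 - \frac{1}{A(z)}, \qquad X(z) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log A(z^k),$$

where $\mu(k)$ is the Möbius function.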


PROOF. For sequences, the relation $A(z) = (1 - X(z))^{-1}$ is readily inverted. For
multisets, start from the fundamental relation of Theorem I.1 and take logarithms:

$$\log(A(z)) = \sum_{k \ge 1} \frac{1}{k} X(z^k).$$

Let $L = \log A$ and $L_n = [z^n] L(z)$. One has

$$n L_n = \sum_{d \mid n} (d X_d),$$


to which it suffices to apply Möbius inversion; see APPENDIX A: Arithmetical functions,
p. 667.

EXAMPLE I.17. Indecomposable permutations. A permutation $\sigma = \sigma_1 \cdots \sigma_n$ (written
here as a word of distinct letters) is said to be decomposable if, for some $k < n$, $\sigma_1 \cdots \sigma_k$ is
a permutation of $\{1, \ldots, k\}$, i.e., a strict prefix of the permutation is itself a permutation.
Any permutation decomposes uniquely as a catenation of indecomposable permutations; for
instance, here is the decomposition of $\sigma = 2\ 5\ 4\ 1\ 3\ 6\ 8\ 7\ 10\ 9$:

$$\sigma = [\,2\ 5\ 4\ 1\ 3\,]\ [\,6\,]\ [\,8\ 7\,]\ [\,10\ 9\,].$$

Thus the class $\mathcal{P}$ of all permutations and the class $\mathcal{I}$ of indecomposable ones are related by

$$\mathcal{P} = \mathrm{SEQ}(\mathcal{I}).$$

This determines $I(z)$ implicitly, and Theorem I.5 gives:

$$I(z) = 1 - \frac{1}{P(z)} \qquad\text{where}\qquad P(z) = \sum_{n \ge 0} n!\, z^n.$$

This example illustrates the implicit structure theorem, but also the possibility of bona fide
algebraic calculations with power series even in cases where they are divergent (APPENDIX A:
Formal power series, p. 676). One finds

$$I(z) = z + z^2 + 3z^3 + 13z^4 + 71z^5 + 461z^6 + 3447z^7 + \cdots,$$

where the coefficients are EIS A003319 and

$$I_n = n! - \sum_{\substack{n_1 + n_2 = n \\ n_1, n_2 \ge 1}} (n_1!\, n_2!) + \sum_{\substack{n_1 + n_2 + n_3 = n \\ n_1, n_2, n_3 \ge 1}} (n_1!\, n_2!\, n_3!) - \cdots.$$

From there, simple majorizations of the terms imply that $I_n \sim n!$, so that almost all permutations
are indecomposable; see [98, p. 262]. END OF EXAMPLE I.17.
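
The coefficients of $I(z) = 1 - 1/P(z)$ are easy to compute from the equivalent relation $P(z)(1 - I(z)) = 1$, and can be compared with a direct enumeration. The sketch below is illustrative; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def indecomposable_counts(n_max):
    """I_0..I_n_max obtained from P(z) * (1 - I(z)) = 1 with P_n = n!."""
    counts = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        counts[n] = factorial(n) - sum(
            counts[k] * factorial(n - k) for k in range(1, n)
        )
    return counts


def is_indecomposable(perm):
    """True if no strict prefix of perm (a tuple of 1..n) is a permutation of 1..k."""
    highest = 0
    for k, value in enumerate(perm[:-1], start=1):
        highest = max(highest, value)
        if highest == k:      # the first k values are exactly {1, ..., k}
            return False
    return True


def indecomposable_brute_force(n):
    """Count indecomposable permutations of size n by exhaustive listing."""
    return sum(is_indecomposable(p) for p in permutations(range(1, n + 1)))
```

The prefix test relies on the fact that $\sigma_1 \cdots \sigma_k$ is a permutation of $\{1, \ldots, k\}$ exactly when its largest value equals $k$.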


▷ I.55. 2-dimensional wanderings. A drunkard starts from the origin in the $\mathbb{Z} \times \mathbb{Z}$ plane and,
at each second, he makes a step in either one of the four directions, NW, NE, SW, SE. The steps
are thus ↖, ↗, ↙, ↘. Consider the class $\mathcal{L}$ of “primitive loops” defined as walks that start and
end at the origin, but do not otherwise touch the origin. The GF of $\mathcal{L}$ is (EIS A002894)

$$L(z) = 1 - \frac{1}{\sum_{n \ge 0} \binom{2n}{n}^2 z^{2n}} = 4z^2 + 20z^4 + 176z^6 + 1876z^8 + \cdots.$$

(Hint: a walk is determined by its projections on the horizontal and vertical axes; 1-dimensional
walks that return to the origin in $2n$ steps are enumerated by $\binom{2n}{n}$.) In particular $[z^n] L(z/4)$ is
the probability that the random walk first returns to the origin in $n$ steps.

Such problems largely originate with Pólya and the implicit structure technique above
was most likely known to him [396]. See [69] for similar multidimensional extensions. The
first return problem is analysed asymptotically in Chapter VI, based on singularity theory and
Hadamard closure properties. ◁
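
The expansion of $L(z)$ in the note above follows from the sequence relation "loops = SEQ(primitive loops)". A minimal sketch, with hypothetical function names, indexes everything by half-length $n$ (walks of $2n$ steps):

```python
from fractions import Fraction
from math import comb


def primitive_loops(n_max):
    """loops[n] = number of primitive loops with 2n steps (loops[0] = 0)."""
    walks = [comb(2 * n, n) ** 2 for n in range(n_max + 1)]   # all loops with 2n steps
    loops = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # W(z) * (1 - L(z)) = 1, read off at z^(2n)
        loops[n] = walks[n] - sum(loops[k] * walks[n - k] for k in range(1, n))
    return loops


def first_return_probability(n):
    """Probability that the walk first returns to the origin after exactly 2n steps."""
    return Fraction(primitive_loops(n)[n], 4 ** (2 * n))
```

For example, the walk first returns after two steps with probability $4/16 = 1/4$.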
EXAMPLE I.18. Irreducible polynomials over finite fields. Objects not obviously of a
combinatorial nature can sometimes be enumerated by symbolic methods. Here is an indirect
construction relative to polynomials over finite fields. We fix a prime number $p$ and consider the
base field $\mathbb{F}_p$ of integers taken modulo $p$. The polynomial ring $\mathbb{F}_p[X]$ is the ring of polynomials
in $X$ with coefficients taken in $\mathbb{F}_p$. For all practical purposes, one may restrict attention to
polynomials that are monic, that is, whose leading coefficient is 1.

First, let $\mathcal{P}$ be the class of all monic polynomials, with the size of a polynomial being its
degree. Since a monic polynomial of degree $n$ is described by a choice of $n$ coefficients, one
has

$$\mathcal{P} \cong \mathrm{SEQ}(\mathbb{F}_p) \implies P(z) = \frac{1}{1 - pz} \quad\text{and}\quad P_n = p^n.$$

A polynomial is said to be irreducible if it does not decompose as a product of two polynomials
of smaller degrees. By unique factorization, each monic polynomial decomposes uniquely into
a product (with repetitions being possible) of monic irreducible polynomials. For instance, over
$\mathbb{F}_3$, one has

$$X^{10} + X^8 + 1 = (X + 1)^2 (X + 2)^2 (X^6 + 2X^2 + 1).$$

Let $\mathcal{I}$ be the set of monic irreducible polynomials. The combinatorial isomorphism

$$\mathcal{P} \cong \mathrm{MSET}(\mathcal{I})$$

expresses precisely the unique factorization property. Thus, the irreducibles are determined
implicitly from the class of all polynomials whose OGF is known. Theorem I.5 implies the
identity

$$I(z) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log \frac{1}{1 - pz^k},$$

and, upon extracting coefficients,

$$I_n = \frac{1}{n} \sum_{k \mid n} \mu(k)\, p^{n/k}.$$

In particular, $I_n$ is asymptotic to $p^n/n$. This estimate constitutes the density theorem for
irreducible polynomials:

The fraction of irreducible polynomials amongst all polynomials of degree $n$ over
the finite field $\mathbb{F}_p$ is asymptotic to $\frac{1}{n}$.

This property is analogous to the Prime Number Theorem of number theory (which is
technically much harder [107]), according to which the proportion of prime numbers in the
interval $[1, n]$ is asymptotic to $\frac{1}{\log n}$. (The result was known to Gauss. See Knopfmacher’s
book [297] for an abstract discussion of statistical properties of arithmetical semigroups.)
END OF EXAMPLE I.18.
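
The counting formula for $I_n$ can be cross-checked against a brute-force sieve over small fields. The sketch below is an illustration under assumptions: polynomials are encoded as coefficient tuples, all function names are hypothetical, and the brute-force part is only practical for small $p$ and $n$.

```python
from itertools import product


def mobius(n):
    """Möbius function, computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # a squared prime factor
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def irreducible_count(p, n):
    """I_n = (1/n) * sum of mu(k) * p^(n/k) over the divisors k of n."""
    total = sum(mobius(k) * p ** (n // k) for k in range(1, n + 1) if n % k == 0)
    return total // n


def monic(p, degree):
    """All monic polynomials of a given degree, as tuples (c_0, ..., c_{d-1}, 1)."""
    return [coeffs + (1,) for coeffs in product(range(p), repeat=degree)]


def poly_mul_mod(a, b, p):
    """Product of two polynomials with coefficients reduced modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return tuple(out)


def irreducible_count_brute(p, n):
    """Count monic irreducibles of degree n by removing every product of two factors."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for a in monic(p, d):
            for b in monic(p, n - d):
                reducible.add(poly_mul_mod(a, b, p))
    return p ** n - len(reducible)
```

Over $\mathbb{F}_2$ the formula gives 2, 1, 2, 3, 6, 9 irreducible polynomials of degrees 1 to 6, in agreement with the sieve.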


▷ I.56. Square-free polynomials. Let $\mathcal{Q}$ be the class of monic square-free polynomials (i.e.,
polynomials not divisible by the square of a polynomial). One has by “Vallée’s identity” (p. 29)
$Q(z) = P(z)/P(z^2)$, hence

$$Q(z) = \frac{1 - pz^2}{1 - pz} \qquad\text{and}\qquad Q_n = p^n - p^{n-1} \quad (n \ge 2).$$

Berlekamp’s book [41] discusses such facts together with relations to error correcting codes. ◁


▷ I.57. Balanced trees. The class $\mathcal{E}$ of balanced 2-3 trees contains all the (rooted planar) trees
whose internal nodes have degree 2 or 3 and such that all leaves are at the same distance from
the root. Only leaves contribute to size. Such trees, which are particular cases of B-trees, are a
useful data structure for implementing dynamic dictionaries [307, 433]. Balanced trees satisfy
an implicit equation based on combinatorial substitution:

$$\mathcal{E} = \mathcal{Z} + \mathcal{E} \circ [(\mathcal{Z} \times \mathcal{Z}) + (\mathcal{Z} \times \mathcal{Z} \times \mathcal{Z})] \implies E(z) = z + E(z^2 + z^3).$$

The expansion starts as (EIS A014535)

$$E(z) = z + z^2 + z^3 + z^4 + 2z^5 + 2z^6 + 3z^7 + 4z^8 + 5z^9 + 8z^{10} + \cdots.$$

Odlyzko [375] has determined the growth of $E_n$ to be roughly $\varphi^n/n$, where
$\varphi = (1 + \sqrt{5})/2$ is the golden ratio. Cf. Section IV.7.2, p. 267 for a partial analysis. ◁
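
The functional equation $E(z) = z + E(z^2 + z^3)$ determines the coefficients recursively, since $[z^n](z^2 + z^3)^k = \binom{k}{n - 2k}$. A short sketch (the function name is a hypothetical choice):

```python
from math import comb


def balanced_23_trees(n_max):
    """E_n for n = 0..n_max, from E(z) = z + E(z^2 + z^3)."""
    e = [0] * (n_max + 1)
    if n_max >= 1:
        e[1] = 1
    for n in range(2, n_max + 1):
        # a tree with n leaves arises from a tree with k leaves, each leaf being
        # replaced by 2 or 3 leaves; exactly n - 2k of the k leaves receive 3
        e[n] = sum(e[k] * comb(k, n - 2 * k) for k in range(1, n // 2 + 1))
    return e
```

Only indices $k \le n/2$ contribute, so each $E_n$ depends on strictly smaller indices and the recursion is well founded.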


I.7. Perspective 


This chapter and the next amount to a survey of elementary combinatorial enu- 
merations, organized in a coherent manner and summarized in Figure 14. We refer to 
the process of specifying combinatorial classes using these constructions and then au- 
tomatically having access to the corresponding generating functions as the symbolic 
method. The symbolic method is the “combinatorics” in analytic combinatorics: it 
allows us to organize classical results in combinatorics with a unifying overall ap- 
proach, to derive new results that generalize and extend classical problems, and to 
address new classes of problems that are arising in computer science, computational 
biology, statistical physics, and other scientific disciplines. 

More important, the symbolic method leaves us with generating functions that we 
can handle with the “analytic” part of analytic combinatorics. A full treatment of this 
feature of the approach is premature, but a brief discussion may help place the rest of 
the book in context. 

For a given class of problems, the symbolic method typically leads to a unified 
treatment that reveals a natural class of functions in which generating functions lie. 
Even though the symbolic method is completely formal, we can often successfully 
proceed by using classical techniques from complex and asymptotic analysis. For 
example, denumerants with a finite set of coin denominations always lead to ratio- 
nal generating functions with poles on the unit circle. Such an observation is useful 
since then a common strategy for coefficient extraction can be applied (partial fraction 
expansion, in the case of denumerants with fixed coin denominations). In the same 
vein, the run statistics constitute a particular case of the general theorem of Chomsky 
and Schützenberger to the effect that the generating function of a regular language
is necessarily a rational function. Theorems of this sort establish a bridge between 
combinatorial analysis and special functions. 

Not all applications of the symbolic method are automatic (though that is certainly 
a goal underlying the approach). The example of counting set partitions shows that 
application of the symbolic method may require finding an adequate presentation of 
the combinatorial structures to be counted. In this way, bijective combinatorics enters 
the game in a nontrivial fashion. 

Our introductory examples of compositions and partitions correspond to classes 
of combinatorial structures with explicit “iterative” definitions, a fact leading in turn to 
explicit generating function expressions. The tree examples then introduce recursively 
defined structures. In that case, the recursive definition translates into a functional 
equation that only determines the generating function implicitly. In simpler situations 
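(like binary or general trees), the equation can be solved and explicit counting results
still follow. In other cases (like non-planar trees) one can usually proceed with complex
asymptotic analysis directly from the functional equation and obtain very precise
asymptotic estimates; see Chapters IV-VII.

1. The main constructions of disjoint union (combinatorial sum), product, sequence,
set, multiset, and cycle and their translation into generating functions (Theorem I.1).

    Construction                 OGF
    Union      A = B + C         $A(z) = B(z) + C(z)$
    Product    A = B × C         $A(z) = B(z) \cdot C(z)$
    Sequence   A = SEQ(B)        $A(z) = \frac{1}{1 - B(z)}$
    Powerset   A = PSET(B)       $A(z) = \exp\left( B(z) - \frac{1}{2} B(z^2) + \frac{1}{3} B(z^3) - \cdots \right)$
    Multiset   A = MSET(B)       $A(z) = \exp\left( B(z) + \frac{1}{2} B(z^2) + \frac{1}{3} B(z^3) + \cdots \right)$
    Cycle      A = CYC(B)        $A(z) = \sum_{k \ge 1} \frac{\varphi(k)}{k} \log \frac{1}{1 - B(z^k)}$

2. The translation for sets, multisets, and cycles constrained by the number of components
(Theorem I.3, p. 78).

    SEQ_k(B):    $B(z)^k$
    PSET_2(B):   $\frac{B(z)^2}{2} - \frac{B(z^2)}{2}$
    MSET_2(B):   $\frac{B(z)^2}{2} + \frac{B(z^2)}{2}$
    CYC_2(B):    $\frac{B(z)^2}{2} + \frac{B(z^2)}{2}$
    PSET_3(B):   $\frac{B(z)^3}{6} - \frac{B(z) B(z^2)}{2} + \frac{B(z^3)}{3}$
    MSET_3(B):   $\frac{B(z)^3}{6} + \frac{B(z) B(z^2)}{2} + \frac{B(z^3)}{3}$
    CYC_3(B):    $\frac{B(z)^3}{3} + \frac{2 B(z^3)}{3}$
    PSET_4(B):   $\frac{B(z)^4}{24} - \frac{B(z)^2 B(z^2)}{4} + \frac{B(z) B(z^3)}{3} + \frac{B(z^2)^2}{8} - \frac{B(z^4)}{4}$
    MSET_4(B):   $\frac{B(z)^4}{24} + \frac{B(z)^2 B(z^2)}{4} + \frac{B(z) B(z^3)}{3} + \frac{B(z^2)^2}{8} + \frac{B(z^4)}{4}$
    CYC_4(B):    $\frac{B(z)^4}{4} + \frac{B(z^2)^2}{4} + \frac{B(z^4)}{2}$

3. The additional constructions of pointing and substitution (Section I.6).

    Construction                 OGF
    Pointing      A = ΘB         $A(z) = z \frac{d}{dz} B(z)$
    Substitution  A = B ∘ C      $A(z) = B(C(z))$

FIGURE I.14. A dictionary of constructions applicable to unlabelled structures, together
with their translation into ordinary generating functions (OGFs). (The labelled counterpart
of this table appears in Figure 16 of Chapter II, p. 137.)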

Analytic combinatorics is characterized by the focus on constructions that leave 
us with generating functions that yield to classical techniques in complex analysis and 
asymptotic analysis. For some combinatorial classes, as we shall see, we have the- 
orems that carry us all the way from purely combinatorial constructions through to 
asymptotic estimates for counting sequences, under general assumptions. For others, 
the general theorems are yet to be proved, but the symbolic method lays the ground- 
work for analysis that leads to the results that we seek. 

Modern presentations of combinatorial analysis appear in the books of Comtet [98] (a 
beautiful book largely example-driven), Stanley [447, 449] (a rich set with an algebraic orienta- 
tion), Wilf [496] (generating functions oriented), and Lando [326] (a neat modern introduction). 
An elementary but insightful presentation of the basic techniques appears in Graham, Knuth, 
and Patashnik’s classic [248], a popular book with a highly original design. An encyclope- 
dic reference is the book of Goulden & Jackson [244] whose descriptive approach very much 
parallels ours. 

The sources of the modern approaches to combinatorial analysis are hard to trace since they 
are usually based on earlier traditions and informally stated mechanisms that were well mastered 
by practicing combinatorial analysts. (See for instance MacMahon’s book [350] Combinatory 
Analysis first published in 1917, the introduction of denumerant generating functions by Pólya
as exposed in [398], or the “domino theory” in [248, Sec. 7.1].) One source in recent times is
the Chomsky-Schützenberger theory of formal languages and enumerations [89]. Rota [414]
and Stanley [445, 449] developed an approach which is largely based on partially ordered sets. 
Bender and Goldman developed a theory of “prefabs” [34] whose purposes are similar to the 
theory developed here. Joyal [286] proposed an especially elegant framework, the “theory of 
species”, that addresses foundational issues in combinatorial theory and constitutes the starting 
point of the superb exposition by Bergeron, Labelle, and Leroux [39]. Parallel (but independent) 
developments by the “Russian School” are nicely synthetized in the books by Sachkov [420, 
421]. 

One of the reasons for the revival of interest in combinatorial enumerations and proper- 
ties of random structures is the analysis of algorithms (a subject founded in modern times by 
Knuth [309]), where the goal is to predict the performance characteristics of computer pro- 
grams. The symbolic ideas exposed here have been applied to the analysis of algorithms in 
surveys [175, 486] and are further exposed in our book [434]. Flajolet, Salvy, and Zimmermann
[206] have shown how to use them in order to automate the analysis of some well
characterized classes of combinatorial structures. Even more recently, several researchers in statistical
physics, computational biology, and other scientific disciplines have been drawn towards the
study of the sorts of discrete models that can be specified by the sorts of combinatorial constructions
that we have described, and therefore are candidates for study via analytic combinatorics.
Research in these fields is the driving force in the study of new kinds of constructions on the
combinatorics side that lead to new methods on the analytic side.

Many objects of classical combinatorics present themselves naturally as labelled structures
where atoms of an object (typically nodes in a graph or a tree) are distinguishable
from one another by the fact that they bear distinct labels. Without loss of generality,
we may take the set from which labels are drawn to be the set of positive integers. 
For instance, a permutation can be viewed as a linear arrangement of distinct labels; 
its cycle decomposition represents it as an unordered collection of circular directed 
graphs whose nodes are labelled by integers. 

Operations on labelled structures are based on a special product: the labelled 
product that distributes labels between components. This operation is a natural ana- 
logue of the cartesian product for plain unlabelled objects. The labelled product in 
turn leads to labelled analogues of the sequence, set, and cycle constructions. 

Labelled constructions translate over exponential generating functions. The trans- 
lation schemes turn out to be analytically even simpler than in the unlabelled case 
considered in the previous chapter. At the same time, labelled constructions enable 
us to take into account structures that are in many ways combinatorially richer than 
their unlabelled counterparts, in particular as regards order properties. They constitute 
another facet, with powerful descriptive powers, of the symbolic method for combi- 
natorial enumeration. 

In this chapter, we examine some of the most important classes of labelled objects, 
including surjections, set partitions, permutations, labelled graphs and labelled trees, 
as well as graphs and mappings from a finite set into itself. Certain aspects of words
can also be treated by this theory, a fact which has numerous consequences not only
in combinatorics itself but also in probability and statistics. In particular, labelled
constructions of words can be put to use in order to elegantly solve two classical
problems, the birthday problem and the coupon collector problem, as well as several
of their variants that have numerous applications in other fields, including the analysis
of hashing algorithms in computer science.

¹ “This approach eliminates virtually all calculations.”


II. 1. Labelled classes 


Throughout this chapter, we consider combinatorial classes in the sense of Chap- 
ter I: we deal exclusively with finite objects; a combinatorial class A is a set of objects, 
with a notion of size attached, so that the number of objects of each size in A is finite. 
To these basic concepts, we now add the idea that the objects are labelled, by which 
we mean that each atom carries with it a distinctive colour, or equivalently an integer 
label, in such a way that all the labels occurring in an object are distinct. Precisely: 


Definition II.1. A weakly labelled object of size $n$ is a graph whose set of vertices
is a subset of the integers. Equivalently, we say that the vertices bear labels, with
the implied condition that labels are distinct integers from $\mathbb{Z}$. An object of size $n$ is
said to be well-labelled, or simply labelled, if it is weakly labelled and, in addition,
its collection of labels is the complete integer interval $[1 \,..\, n]$. A labelled class is a
combinatorial class comprised of well-labelled objects.


The graphs considered may be directed or undirected. In fact, when the need 
arises, we shall take “object” to mean any kind of discrete structure enriched by in- 
teger labels. Virtually all labelled classes considered in this book can eventually be 
encoded as graphs of sorts, so that this extended use of the notion of a labelled class 
is a harmless convenience. (See Section I. 7 for a brief discussion of alternative but 
logically equivalent frameworks for the notion of a labelled class.) 


EXAMPLE II.1. Labelled graphs. A labelled graph is by definition an undirected graph
such that distinct integer labels forming an interval of the form $\{1, 2, \ldots, n\}$ are supported by
vertices. A particular labelled graph of size 4 is then

          1---3
    g  =  |   |
          4---2

which represents a graph whose vertices bear the labels $\{1, 2, 3, 4\}$ and whose set of edges is

$$\{\, \{1,3\},\ \{2,3\},\ \{2,4\},\ \{1,4\} \,\}.$$

Only the graph structure (as defined by its set of edges) counts, so that this is the same abstract
graph as in the alternative visual representations

    1---4        3---2
    |   |   ,    |   |
    3---2        1---4

However, this graph is different from either of

          4---1              3---1
    h  =  |   |   ,    j  =  |   |
          3---2              4---2

since, for instance, 1 and 2 are adjacent in $h$ and $j$, but not in $g$. Altogether, there are 3 different
labelled graphs (namely, $g$, $h$, $j$), that have the same “shape”, corresponding to the unlabelled
quadrangle graph $Q$ (a square with four unlabelled vertices).
Figure 1 lists all the 64 labelled graphs of size 4 as well as their 11 unlabelled counterparts
viewed as equivalence classes of labelled graphs when labels are ignored. END OF EXAMPLE II.1.

FIGURE II.1. Labelled versus unlabelled graphs for size $n = 4$. There are altogether
$G_4 = 64 = 2^6$ labelled graphs of size 4, i.e., comprising 4 nodes, in
agreement with the general formula (see p. 97 for details): $G_n = 2^{n(n-1)/2}$. The labelled
graphs can be grouped into equivalence classes up to arbitrary permutation of the labels, which
determines the $\widehat{G}_4 = 11$ unlabelled graphs of size 4. Each unlabelled graph corresponds to a
variable number of labelled graphs: for instance, the totally disconnected graph (bottom, left)
and the complete graph (top, right) correspond to 1 labelling only, while the line graph admits
$\frac{1}{2}\, 4! = 12$ possible labellings.
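
The counts quoted for size 4 (64 labelled graphs, 11 unlabelled shapes, 3 labellings of the quadrangle, 12 of the line graph) can be reproduced by brute force. The sketch below is illustrative; the edge-set encoding and the function names are assumptions, and the canonical form used (minimum over all relabellings) is only sensible for very small $n$.

```python
from itertools import combinations, permutations


def labelled_graphs(n):
    """All labelled graphs on vertices 1..n, each given as a frozenset of edges (a, b), a < b."""
    pairs = list(combinations(range(1, n + 1), 2))
    graphs = []
    for mask in range(2 ** len(pairs)):
        graphs.append(frozenset(e for bit, e in enumerate(pairs) if (mask >> bit) & 1))
    return graphs


def relabel(graph, perm):
    """Apply the relabelling i -> perm[i - 1] to every edge."""
    return frozenset(tuple(sorted((perm[a - 1], perm[b - 1]))) for a, b in graph)


def canonical(graph, n):
    """Smallest sorted edge list over all n! relabellings: a name for the unlabelled shape."""
    return min(tuple(sorted(relabel(graph, p))) for p in permutations(range(1, n + 1)))


def shapes(n):
    """Map each unlabelled graph (canonical form) to its number of distinct labellings."""
    classes = {}
    for g in labelled_graphs(n):
        key = canonical(g, n)
        classes[key] = classes.get(key, 0) + 1
    return classes
```

As a usage example, `shapes(4)` has 11 keys whose values sum to 64, and the class of the graph $g$ above contains exactly the three labelled graphs $g$, $h$, $j$.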


In order to count labelled objects, we appeal to exponential generating functions. 


Definition II.2. The exponential generating function (EGF) of a sequence $(A_n)$ is
the formal power series

$$A(z) = \sum_{n \ge 0} A_n \frac{z^n}{n!}. \tag{1}$$

The exponential generating function (EGF) of a class $\mathcal{A}$ is the exponential generating
function of the numbers $A_n = \operatorname{card}(\mathcal{A}_n)$. Equivalently, the EGF of class $\mathcal{A}$ is

$$A(z) = \sum_{n \ge 0} A_n \frac{z^n}{n!} = \sum_{\alpha \in \mathcal{A}} \frac{z^{|\alpha|}}{|\alpha|!}.$$

It is also said that the variable $z$ marks size in the generating function.


With the standard notation for coefficients of series, the coefficient $A_n$ in an exponential
generating function is then recovered by

$$A_n = n! \cdot [z^n] A(z),$$

since $[z^n] A(z) = A_n/n!$ by the definition of EGFs and in accordance with the coefficient
extractor notation, Eq. (6) of Chapter I.

Note that, as in the previous chapter, we adhere to a systematic naming convention
for generating functions of combinatorial structures. A labelled class $\mathcal{A}$, its
counting sequence $(A_n)$ (or $(a_n)$) and its exponential generating function $A(z)$ (or
$a(z)$) are all denoted by the same group of letters.


Neutral and atomic classes. As in the unlabelled universe, it proves useful to
introduce a neutral (empty, null) object $\epsilon$ that has size 0 and bears no label at all, and
consider it as a special labelled object; a neutral class $\mathcal{E}$ is then by definition $\mathcal{E} = \{\epsilon\}$.
The (labelled) atomic class $\mathcal{Z}$ = {①} is formed of a unique object of size 1 that, being
well-labelled, bears the integer label ①. The EGFs of the neutral class and the atomic
class are respectively

\[ E(z) = 1, \qquad Z(z) = z. \]
EXAMPLE II.2. Permutations. The class $\mathcal{P}$ of all permutations is prototypical of labelled
classes. Under the linear representation of permutations, where

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix} \]

is represented as the sequence $(\sigma_1, \sigma_2, \ldots, \sigma_n)$, the class $\mathcal{P}$ is schematically

$\mathcal{P}$ = { $\epsilon$, ①, ①-②, ②-①, ①-②-③, ①-③-②, ②-①-③, ②-③-①, ③-①-②, ③-②-①, ... }

so that $P_0 = 1$, $P_1 = 1$, $P_2 = 2$, $P_3 = 6$, etc. There, by definition, all the possible orderings
of the distinct labels are taken into account, so that the class $\mathcal{P}$ can be equivalently viewed as
the class of all labelled linear digraphs (with an implicit direction, from left to right, say, in the
representation). Accordingly, the class $\mathcal{P}$ of permutations has the counting sequence $P_n = n!$
(argument: there are $n$ positions where to place the element 1, then $(n-1)$ possible positions
for 2, and so on). Thus the EGF of $\mathcal{P}$ is

\[ P(z) = \sum_{n \ge 0} n! \frac{z^n}{n!} = \sum_{n \ge 0} z^n = \frac{1}{1-z}. \]

Permutations, as they contain information relative to the order of their elements, are essential in
many applications related to order statistics. END OF EXAMPLE II.2.


²Some authors prefer the notation $[\frac{z^n}{n!}]A(z)$ to $n![z^n]A(z)$, which we avoid in this book. Indeed,
Knuth [305] argues convincingly that the variant notation is not consistent with many desirable properties
of a “good” coefficient operator (e.g., bilinearity).

EXAMPLE II.3. Urns. The class $\mathcal{U}$ of totally disconnected graphs starts as

$\mathcal{U}$ = { $\epsilon$, ①, ① ②, ① ② ③, ... }

Order between the labelled atoms does not count, so that for each $n$, there is only one possible
arrangement and $U_n = 1$. The class $\mathcal{U}$ can be regarded as the class of “urns”, where an
urn of size $n$ contains $n$ distinguishable balls in an unspecified (and irrelevant) order. The
corresponding EGF is

\[ U(z) = \sum_{n \ge 0} 1 \cdot \frac{z^n}{n!} = \exp(z) = e^z. \]

(The fact that the EGF of the constant sequence $(1)_{n \ge 0}$ is the exponential function explains the
term “exponential generating function”.) It also proves convenient, in several applications, to
represent elements of an urn in a sorted sequence, which leads to an equivalent representation
of urns as increasing linear graphs; for instance,

①-②-③-④-⑤

may be equivalently used to represent the urn of size 5. Though urns look trivial at first glance,
they are of particular importance as building blocks of complex labelled structures (e.g.,
allocations of various sorts), as we shall see shortly. END OF EXAMPLE II.3.


EXAMPLE II.4. Circular graphs. Finally, the class of circular graphs, where cycles are
oriented in some conventional manner (say, positively here) is

[Drawing omitted: the class $\mathcal{C}$ of oriented cycles on labelled nodes.]

Cyclic graphs correspond bijectively to cyclic permutations. One has $C_n = (n-1)!$ (argument:
a directed cycle is determined by the succession of elements that “follow” 1, hence by a
permutation of $n-1$ elements). Thus, one has

\[ C(z) = \sum_{n \ge 1} (n-1)! \frac{z^n}{n!} = \sum_{n \ge 1} \frac{z^n}{n} = \log \frac{1}{1-z}. \]

As we shall see in the next section, the logarithm is characteristic of circular arrangements of
labelled objects. END OF EXAMPLE II.4.


▷ II.1. Labelled trees. Let now $U_n$ be the number of labelled graphs with $n$ vertices that are
connected and acyclic; equivalently, $U_n$ is the number of labelled unrooted nonplane trees. Let
$T_n$ be the number of labelled rooted nonplane trees. The identity $T_n = nU_n$ is elementary,
since all vertices in a labelled tree are distinguishable (by their labels) and a root can be chosen
in $n$ possible ways. In Section II. 5, we shall prove that $U_n = n^{n-2}$ and $T_n = n^{n-1}$. ◁


II. 2. Admissible labelled constructions 


We now describe a toolkit of constructions that make it possible to build complex 
labelled classes from simpler ones. Combinatorial sum or disjoint union is defined 
exactly as in Chapter I: it is the union of disjoint copies. To define a product that is 
adapted to labelled structures, we cannot use the cartesian product, since an ordered 
pair of two labelled objects is not well-labelled (for instance the label 1 would invariably
appear twice). Instead, we define a new operation, the labelled product,
which translates naturally into exponential generating functions. From there, simple
translation rules follow for labelled sequences, sets, and cycles. 

Binomial convolutions. As a preparation to the translation of labelled constructions,
we first briefly review the effect of products over EGFs. Let $a(z)$, $b(z)$, $c(z)$ be
EGFs, with $a(z) = \sum_n a_n z^n/n!$, and so on. The binomial convolution formula is:

\[ \text{if } a(z) = b(z) \cdot c(z), \text{ then } a_n = \sum_{k=0}^{n} \binom{n}{k} b_k c_{n-k}. \tag{2} \]

This formula results from the usual product of formal power series,

\[ \frac{a_n}{n!} = \sum_{k=0}^{n} \frac{b_k}{k!} \cdot \frac{c_{n-k}}{(n-k)!} \quad \text{and} \quad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \]

In the same vein, if $a(z) = a^{(1)}(z)\, a^{(2)}(z) \cdots a^{(r)}(z)$, then

\[ a_n = \sum_{n_1+n_2+\cdots+n_r=n} \binom{n}{n_1, n_2, \ldots, n_r} a^{(1)}_{n_1} a^{(2)}_{n_2} \cdots a^{(r)}_{n_r}. \tag{3} \]

In Equation (3) there occurs the multinomial coefficient

\[ \binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}, \]

which counts the number of ways of splitting $n$ elements into $r$ distinguished classes
of cardinalities $n_1, \ldots, n_r$. This property lies at the very heart of enumerative
applications of binomial convolutions and EGFs.
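As a small sanity check of formula (2), the following Python sketch computes a binomial convolution directly on counting sequences. The function name `binomial_convolution` is ours, chosen for illustration only.

```python
from math import comb

def binomial_convolution(b, c):
    """Return a with a_n = sum_k C(n, k) * b_k * c_{n-k}, as in formula (2)."""
    size = min(len(b), len(c))
    return [sum(comb(n, k) * b[k] * c[n - k] for k in range(n + 1))
            for n in range(size)]

# The constant sequence (1, 1, 1, ...) has EGF e^z, and e^z * e^z = e^{2z},
# whose coefficient sequence is 2^n.
ones = [1] * 6
print(binomial_convolution(ones, ones))
```

Feeding in the sequence $n!$ (EGF $1/(1-z)$) together with the constant sequence gives the coefficients of $e^z/(1-z)$ in the same way.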


II. 2.1. Labelled constructions. A labelled object may be relabelled. We only 
consider consistent relabellings defined by the fact that they preserve the order rela- 
tions among labels. Then two dual modes of relabellings prove important: 

— Reduction: For a weakly labelled structure of size $n$, this operation reduces
its labels to the standard interval $[1..n]$ while preserving the relative order
of labels. For instance, the sequence (7, 3, 9, 2) reduces to (3, 2, 4, 1). We
use $\rho(\alpha)$ to denote the canonical reduction of the structure $\alpha$.

— Expansion: This operation is defined relative to a relabelling function
$e : [1..n] \to \mathbb{Z}$ that is assumed to be strictly increasing. For instance, (3, 2, 4, 1)
may expand as (33, 22, 44, 11), (7, 3, 9, 2), and so on. We use $e(\alpha)$ to denote
the result of relabelling $\alpha$ by $e$.
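The two relabelling modes can be sketched in a few lines of Python, under the simplifying assumption that a structure is given just by the list of its labels. The names `reduce_labels` and `expand_labels` are illustrative, not part of the text.

```python
def reduce_labels(labels):
    """Canonical reduction: replace each label by its rank (1-based) among all labels."""
    rank = {label: i for i, label in enumerate(sorted(labels), start=1)}
    return [rank[label] for label in labels]

def expand_labels(labels, e):
    """Expansion by a relabelling function e, given as a dict defined on [1..n]."""
    keys = sorted(e)
    # e must be strictly increasing so that the order of labels is preserved
    assert all(e[a] < e[b] for a, b in zip(keys, keys[1:]))
    return [e[label] for label in labels]

print(reduce_labels([7, 3, 9, 2]))
print(expand_labels([3, 2, 4, 1], {1: 11, 2: 22, 3: 33, 4: 44}))
```

Reduction undoes any expansion: applying `reduce_labels` to the result of `expand_labels` returns the original well-labelled sequence.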

These notions enable us to devise a product suited to labelled objects. 

The labelled product (or simply product) of objects and classes was originally
formalized under the name of “partitional product” by Foata [217]. Given two labelled
structures $\beta \in \mathcal{B}$ and $\gamma \in \mathcal{C}$, this product, noted $\beta \star \gamma$, is the set of
well-labelled ordered pairs $(\beta', \gamma')$ that reduce to $(\beta, \gamma)$:

\[ \beta \star \gamma := \{\, (\beta', \gamma') \mid (\beta', \gamma') \text{ is well-labelled},\ \rho(\beta') = \beta,\ \rho(\gamma') = \gamma \,\}. \tag{4} \]

An equivalent form is via expansion of labels:

\[ \beta \star \gamma = \{\, (e(\beta), f(\gamma)) \mid \operatorname{Im}(e) \cap \operatorname{Im}(f) = \emptyset,\ \operatorname{Im}(e) \cup \operatorname{Im}(f) = [1..|\beta| + |\gamma|] \,\}, \tag{5} \]

[Drawing omitted: the elements of the labelled product.]

FIGURE II.2. The $10 = \binom{5}{2}$ elements in the labelled product of a triangle and a segment.


where $e$, $f$ are relabelling functions with ranges $\operatorname{Im}(e)$, $\operatorname{Im}(f)$, respectively. Note that
elements of a labelled product are, by construction, well-labelled. Figure 2 displays
the labelled product of a particular object of size 3 with another object of size 2.

The labelled product $\beta \star \gamma$ of two elements $\beta$, $\gamma$ of respective sizes $n_1$, $n_2$ is a set
whose cardinality is, with $n = n_1 + n_2$, expressed as

\[ \binom{n_1 + n_2}{n_1, n_2} = \binom{n}{n_1}, \]

since this quantity is the number of legal relabellings by expansion of the pair $(\beta, \gamma)$.
(The example of Figure 2 verifies that the number of relabellings is indeed $\binom{5}{2} = 10$.)
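The count can be checked by enumerating form (5) directly. The sketch below assumes, for illustration, that an object is represented by the tuple of its labels read in some fixed structural order; the function name `labelled_product` is ours.

```python
from itertools import combinations

def labelled_product(beta, gamma):
    """All pairs (e(beta), f(gamma)) with e, f increasing and Im(e), Im(f) partitioning [1..n]."""
    n1, n2 = len(beta), len(gamma)
    n = n1 + n2
    pairs = []
    for chosen in combinations(range(1, n + 1), n1):
        rest = [x for x in range(1, n + 1) if x not in chosen]
        e = dict(zip(range(1, n1 + 1), chosen))  # increasing: combinations are sorted
        f = dict(zip(range(1, n2 + 1), rest))    # increasing: rest is sorted
        pairs.append((tuple(e[x] for x in beta), tuple(f[x] for x in gamma)))
    return pairs

# A size-3 object and a size-2 object, as in Figure 2
product = labelled_product((1, 2, 3), (1, 2))
print(len(product))
```

Each choice of the label set for the first component determines one element of the product, which is why the cardinality is a binomial coefficient.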


If $\mathcal{B}$ and $\mathcal{C}$ are two labelled classes of combinatorial structures, the labelled product
$\mathcal{A} = \mathcal{B} \star \mathcal{C}$ is defined by the usual extension of operations to sets:

\[ \mathcal{B} \star \mathcal{C} = \bigcup_{\beta \in \mathcal{B},\ \gamma \in \mathcal{C}} (\beta \star \gamma). \tag{6} \]

In summary:


Definition II.3. The labelled product of $\mathcal{B}$ and $\mathcal{C}$, denoted $\mathcal{B} \star \mathcal{C}$, is obtained by forming
ordered pairs from $\mathcal{B} \times \mathcal{C}$ and performing all possible order-consistent relabellings,
ensuring that the resulting pairs are well labelled, as described by (4) or (5), and (6).


Equipped with this notion, we can build sequences, sets, and cycles, in a way
much similar to the unlabelled case. We proceed to do so and, at the same time,
establish admissibility³ of the constructions.


Labelled product. When $\mathcal{A} = \mathcal{B} \star \mathcal{C}$, the corresponding counting sequences satisfy
the relation

\[ A_n = \sum_{|\beta| + |\gamma| = n} \binom{|\beta| + |\gamma|}{|\beta|, |\gamma|} = \sum_{n_1 + n_2 = n} \binom{n}{n_1, n_2} B_{n_1} C_{n_2}. \tag{7} \]

The product $B_{n_1} C_{n_2}$ keeps track of all the possibilities for the $\mathcal{B}$ and $\mathcal{C}$ components
and the binomial coefficient accounts for the number of possible relabellings, in accordance
with our earlier discussion. The binomial convolution property (7) then implies
admissibility,

\[ \mathcal{A} = \mathcal{B} \star \mathcal{C} \quad \Longrightarrow \quad A(z) = B(z) \cdot C(z), \]

with the labelled product simply translating into the product operation on EGFs.

³We recall that a construction is admissible (Chapter I) if the counting sequence of the result only
depends on the counting sequences of the operands. An admissible construction therefore induces a
well-defined transformation over exponential generating functions.
▷ II.2. Multiple labelled products. The (binary) labelled product satisfies the associativity
property,

\[ \mathcal{B} \star (\mathcal{C} \star \mathcal{D}) \cong (\mathcal{B} \star \mathcal{C}) \star \mathcal{D}, \]

which may serve to define $\mathcal{B} \star \mathcal{C} \star \mathcal{D}$. The corresponding EGF is the product $B(z) \cdot C(z) \cdot D(z)$.
This product rule generalizes to $r$ factors with coefficients given by a multinomial
convolution (3). ◁


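k-sequences and sequences. The $k$th (labelled) power of $\mathcal{B}$ is defined as
$(\mathcal{B} \star \mathcal{B} \cdots \mathcal{B})$, with $k$ factors equal to $\mathcal{B}$. It is denoted $\mathrm{SEQ}_k\{\mathcal{B}\}$ as it corresponds to forming
$k$-sequences and performing all consistent relabellings. The (labelled) sequence class
of $\mathcal{B}$ is denoted by $\mathrm{SEQ}\{\mathcal{B}\}$ and is defined by

\[ \mathrm{SEQ}\{\mathcal{B}\} := \{\epsilon\} + \mathcal{B} + (\mathcal{B} \star \mathcal{B}) + (\mathcal{B} \star \mathcal{B} \star \mathcal{B}) + \cdots = \bigcup_{k \ge 0} \mathrm{SEQ}_k\{\mathcal{B}\}. \]

The product relation for EGFs extends to arbitrary products (Note 2), so that

\[ \mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \quad \Longrightarrow \quad A(z) = B(z)^k, \]
\[ \mathcal{A} = \mathrm{SEQ}(\mathcal{B}) \quad \Longrightarrow \quad A(z) = \sum_{k=0}^{\infty} B(z)^k = \frac{1}{1 - B(z)}, \]

where the last equation requires $B_0 = 0$.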


k-sets and sets. We denote by $\mathrm{SET}_k\{\mathcal{B}\}$ the class of $k$-sets formed from $\mathcal{B}$. The
set class is defined formally, as in the case of the unlabelled multiset: it is the quotient
$\mathrm{SET}_k\{\mathcal{B}\} := \mathrm{SEQ}_k\{\mathcal{B}\}/\mathbf{R}$ where the equivalence relation $\mathbf{R}$ identifies two sequences
when the components of one are a permutation of the components of the other (p. 25).
A “set” is like a sequence, but the order between components is immaterial. The
(labelled) set construction applied to $\mathcal{B}$, denoted $\mathrm{SET}\{\mathcal{B}\}$, is then defined by

\[ \mathrm{SET}\{\mathcal{B}\} := \{\epsilon\} + \mathcal{B} + \mathrm{SET}_2\{\mathcal{B}\} + \cdots = \bigcup_{k \ge 0} \mathrm{SET}_k\{\mathcal{B}\}. \]

A labelled $k$-set is associated with exactly $k!$ different sequences. (In the unlabelled
case, formulae are more complex.) Thus in terms of EGFs, one has (assuming $B_0 = 0$)

\[ \mathcal{A} = \mathrm{SET}_k(\mathcal{B}) \quad \Longrightarrow \quad A(z) = \frac{1}{k!} B(z)^k, \]
\[ \mathcal{A} = \mathrm{SET}(\mathcal{B}) \quad \Longrightarrow \quad A(z) = \sum_{k=0}^{\infty} \frac{1}{k!} B(z)^k = \exp(B(z)). \]

Note that the distinction between multisets and powersets that is meaningful for unlabelled
structures is here immaterial: by definition components of a labelled set all have
distinct labels so that, relative to the labelled universe, we have the correspondence:
MSET, PSET ↦ SET.

k-cycles and cycles. We also introduce the class of $k$-cycles, $\mathrm{CYC}_k\{\mathcal{B}\}$, and the
cycle class. The cycle class is defined formally, as in the unlabelled case, as the
quotient $\mathrm{CYC}_k\{\mathcal{B}\} := \mathrm{SEQ}_k\{\mathcal{B}\}/\mathbf{S}$ where the equivalence relation $\mathbf{S}$ identifies two
sequences when the components of one are a cyclic permutation of the components
of the other (p. 24). A cycle is like a sequence whose components can be circularly
shifted. In terms of EGFs, we have (assuming $B_0 = 0$)

\[ \mathcal{A} = \mathrm{CYC}_k(\mathcal{B}) \quad \Longrightarrow \quad A(z) = \frac{1}{k} B(z)^k, \]
\[ \mathcal{A} = \mathrm{CYC}(\mathcal{B}) \quad \Longrightarrow \quad A(z) = \sum_{k=1}^{\infty} \frac{1}{k} B(z)^k = \log \frac{1}{1 - B(z)}, \]

since each cycle admits exactly $k$ representations as a sequence.
In summary: 


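Theorem II.1. The constructions of combinatorial sum (disjoint union), labelled
product, sequence, cycle and set are all admissible. The associated operators on
EGFs are:

Sum:                      $\mathcal{A} = \mathcal{B} + \mathcal{C} \ \Longrightarrow\ A(z) = B(z) + C(z)$
Product:                  $\mathcal{A} = \mathcal{B} \star \mathcal{C} \ \Longrightarrow\ A(z) = B(z) \cdot C(z)$
Sequence:                 $\mathcal{A} = \mathrm{SEQ}(\mathcal{B}) \ \Longrightarrow\ A(z) = \frac{1}{1 - B(z)}$
Sequence, k components:   $\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \equiv (\mathcal{B})^{\star k} \ \Longrightarrow\ A(z) = B(z)^k$
Set:                      $\mathcal{A} = \mathrm{SET}(\mathcal{B}) \ \Longrightarrow\ A(z) = \exp(B(z))$
Set, k components:        $\mathcal{A} = \mathrm{SET}_k(\mathcal{B}) \ \Longrightarrow\ A(z) = \frac{1}{k!} B(z)^k$
Cycle:                    $\mathcal{A} = \mathrm{CYC}(\mathcal{B}) \ \Longrightarrow\ A(z) = \log \frac{1}{1 - B(z)}$
Cycle, k components:      $\mathcal{A} = \mathrm{CYC}_k(\mathcal{B}) \ \Longrightarrow\ A(z) = \frac{1}{k} B(z)^k$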
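The dictionary of Theorem II.1 can be exercised on truncated power series. The following Python sketch is one possible encoding, not a reference implementation: a series is a list of exact `Fraction` coefficients $A_n/n!$ kept modulo $z^N$, and the helper names (`mul`, `compose_sum`, `seq`, `set_`, `cyc`, `counts`) are ours.

```python
from fractions import Fraction
from math import factorial

N = 8  # truncation order: series are kept modulo z^N

def mul(a, b):
    """Product of two truncated series (the EGF image of the labelled product)."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                c[i + j] += ai * bj
    return c

def compose_sum(b, weight):
    """sum_k weight(k) * b(z)^k modulo z^N; requires b_0 = 0 so that powers k >= N vanish."""
    assert b[0] == 0
    total = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # b^0
    for k in range(N):
        total = [t + weight(k) * p for t, p in zip(total, power)]
        power = mul(power, b)
    return total

def seq(b):   # 1 / (1 - B)
    return compose_sum(b, lambda k: Fraction(1))

def set_(b):  # exp(B)
    return compose_sum(b, lambda k: Fraction(1, factorial(k)))

def cyc(b):   # log 1 / (1 - B)
    return compose_sum(b, lambda k: Fraction(1, k) if k else Fraction(0))

def counts(a):
    """Recover A_n = n! [z^n] A(z) from the EGF coefficients."""
    return [int(c * factorial(n)) for n, c in enumerate(a)]

Z = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)  # the atomic class
print(counts(seq(Z)), counts(set_(Z)), counts(cyc(Z)))
```

Applied to the atomic class, the three constructions give back the counting sequences $n!$, $1$ and $(n-1)!$ of permutations, urns and circular graphs, and composing `set_` with `cyc` reproduces `seq` on this input.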


Constructible classes. As in the previous chapter, we say that a class of labelled
objects is constructible if it admits a specification in terms of sums (disjoint unions),
the labelled constructions of product, sequence, set, cycle, and the initial classes
defined by the neutral structure of size 0 and the atomic class $\mathcal{Z}$ = {①}. Regarding the
elementary classes discussed in Section II. 1, it is immediately recognized that

\[ \mathcal{P} = \mathrm{SEQ}\{\mathcal{Z}\}, \qquad \mathcal{U} = \mathrm{SET}\{\mathcal{Z}\}, \qquad \mathcal{C} = \mathrm{CYC}\{\mathcal{Z}\}, \]

specify permutations, urns, and circular graphs respectively. These constructions are
basic building blocks out of which more complex objects can be constructed. In particular,
as we shall explain shortly (Section II. 3 and Section II. 4), set partitions ($\mathcal{S}$),
surjections ($\mathcal{R}$), permutations under their cycle decomposition ($\mathcal{P}$), and alignments ($\mathcal{O}$)
are constructible classes corresponding to

Surjections:     $\mathcal{R} \cong \mathrm{SEQ}\{\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}\}$   (sequences-of-sets),
Set partitions:  $\mathcal{S} \cong \mathrm{SET}\{\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}\}$   (sets-of-sets),
Alignments:      $\mathcal{O} \cong \mathrm{SEQ}\{\mathrm{CYC}\{\mathcal{Z}\}\}$   (sequences-of-cycles),
Permutations:    $\mathcal{P} \cong \mathrm{SET}\{\mathrm{CYC}\{\mathcal{Z}\}\}$   (sets-of-cycles).

An immediate consequence of Theorem II.1 is the fact that the EGF of a constructible
labelled class can be computed automatically.


Theorem II.2. The exponential generating function of a constructible class of labelled
objects is a component of a system of generating function equations whose
terms are built from 1 and $z$ using the operators

\[ +, \qquad \times, \qquad Q(f) = \frac{1}{1 - f}, \qquad E(f) = e^{f}, \qquad L(f) = \log \frac{1}{1 - f}. \]

If we further allow cardinality restrictions in composite constructions, the operators
$f^k$ (for $\mathrm{SEQ}_k$), $f^k/k!$ (for $\mathrm{SET}_k$), and $f^k/k$ (for $\mathrm{CYC}_k$) are to be added to the list.


II. 2.2. Labelled versus unlabelled enumeration. Any labelled class $\mathcal{A}$ has an
unlabelled counterpart $\widehat{\mathcal{A}}$: objects in $\widehat{\mathcal{A}}$ are obtained from objects of $\mathcal{A}$ by ignoring
the labels. This idea is formalized by identifying two labelled objects if there is an
arbitrary relabelling (not just an order-consistent one, as has been used so far) that
transforms one into the other. For an object of size $n$, each equivalence class contains
a priori between 1 and $n!$ elements. Thus:


Proposition II.1. The counts of a labelled class $\mathcal{A}$ and its unlabelled counterpart $\widehat{\mathcal{A}}$
are related by

\[ \widehat{A}_n \le A_n \le n!\, \widehat{A}_n \qquad \text{or equivalently} \qquad 1 \le \frac{A_n}{\widehat{A}_n} \le n!. \tag{8} \]

nr 
EXAMPLE II.5. Labelled and unlabelled graphs. This phenomenon has been already
encountered in our discussion of graphs (Figure 1). Let generally $G_n$ and $\widehat{G}_n$ be the number of
graphs of size $n$ in the labelled and unlabelled case respectively. One finds for $n = 1..15$

 n    $\widehat{G}_n$ (unlabelled)       $G_n$ (labelled)
 1    1                      1
 2    2                      2
 3    4                      8
 4    11                     64
 5    34                     1024
 6    156                    32768
 7    1044                   2097152
 8    12346                  268435456
 9    274668                 68719476736
10    12005168               35184372088832
11    1018997864             36028797018963968
12    165091172592           73786976294838206464
13    50502031367952         302231454903657293676544
14    29054155657235488      2475880078570760549798248448
15    31426485969804308768   40564819207303340847894502572032

The sequence $\{\widehat{G}_n\}$ constitutes EIS A000088, which can be obtained by an extension of methods
of Chapter I; see [259, Ch. 4]. The sequence $\{G_n\}$ is determined directly by the fact that a
graph of $n$ vertices can have each of the $\binom{n}{2}$ possible edges either present or not, so that

\[ G_n = 2^{\binom{n}{2}} = 2^{n(n-1)/2}. \]


The sequence of labelled counts obviously grows much faster than its unlabelled counterpart.
We may then verify the inequality (8) in this particular case. The normalized ratios,

\[ \rho_n = G_n / \widehat{G}_n, \qquad \sigma_n = G_n / (n!\, \widehat{G}_n), \]

are observed to be

 n    $\rho_n = G_n/\widehat{G}_n$         $\sigma_n = G_n/(n!\,\widehat{G}_n)$
 1    1.000000000             1.0000000000
 2    1.000000000             0.5000000000
 3    2.000000000             0.3333333333
 4    5.818181818             0.2424242424
 5    30.11764706             0.2509803922
 6    210.0512821             0.2917378918
 8    21742.70663             0.5392536367
10    2930768.823             0.8076413203
12    446946830.2             0.9330800361
14    0.8521603960 · 10^11    0.9774915111
16    0.2076885783 · 10^14    0.9926428522

From these data, it is natural to conjecture that $\sigma_n$ tends (fast) to 1 as $n$ tends to infinity. This is
indeed a nontrivial fact originally established by Pólya (see Chapter 9 of Harary and Palmer’s
book [259] dedicated to asymptotics of graph enumerations):

\[ \widehat{G}_n \sim \frac{1}{n!}\, 2^{\binom{n}{2}} = \frac{G_n}{n!}. \]

In other words, “almost all” graphs of size $n$ should admit a number of labellings close to $n!$.
(Combinatorially, this corresponds to the fact that in a random unlabelled graph, with high
probability, all of the nodes can be distinguished based on the adjacency structure of the graph;
in such a case, the graph has no nontrivial automorphism and the number of distinct labellings
is $n!$ exactly.) END OF EXAMPLE II.5.
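For very small sizes, both counts can be recovered by brute force, which makes the notion of “equivalence class up to relabelling” concrete. The sketch below enumerates all edge sets on $n$ nodes and identifies two graphs when some permutation of the nodes maps one onto the other; the function name `count_graphs` is illustrative, and the method is only practical for $n$ up to about 5.

```python
from itertools import combinations, permutations

def count_graphs(n):
    """Return (labelled, unlabelled) counts of graphs on n nodes by exhaustive search."""
    nodes = range(n)
    all_edges = list(combinations(nodes, 2))
    perms = list(permutations(nodes))
    canonical_forms = set()
    labelled = 0
    for mask in range(1 << len(all_edges)):
        edges = [e for i, e in enumerate(all_edges) if (mask >> i) & 1]
        labelled += 1
        # canonical form: the smallest relabelled edge list over all n! relabellings
        canon = min(
            tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
            for p in perms
        )
        canonical_forms.add(canon)
    return labelled, len(canonical_forms)

print(count_graphs(4))
```

For $n = 4$ this reproduces the 64 labelled and 11 unlabelled graphs of Figure 1.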


The case of urns and totally disconnected graphs lies at the other extreme,
where

\[ \widehat{U}_n = U_n = 1. \]

The examples of graphs and urns illustrate the fact that, beyond the general bounds of
Proposition II.1, there is no automatic way to translate between labelled and unlabelled
enumerations. At least, if the class $\mathcal{A}$ is constructible, its unlabelled counterpart $\widehat{\mathcal{A}}$ can
be obtained by interpreting all the intervening constructions as unlabelled ones in the
sense of Chapter I (with SET ↦ MSET); both generating functions are then computable,
and their coefficients can be compared.
▷ II.3. Permutations and their unlabelled counterparts. The labelled class of permutations can
be specified by $\mathcal{P} = \mathrm{SEQ}(\mathcal{Z})$; the unlabelled counterpart is the set $\widehat{\mathcal{P}}$ of integers in unary
notation, and $\widehat{P}_n = 1$, so that $P_n = n! \cdot \widehat{P}_n$ exactly. The specification $\mathcal{P}' = \mathrm{SET}(\mathrm{CYC}(\mathcal{Z}))$ describes
sets of cycles and, in the labelled universe, one has $\mathcal{P}' \cong \mathcal{P}$; however the unlabelled
counterpart of $\mathcal{P}'$ is the class $\widehat{\mathcal{P}'} \ne \widehat{\mathcal{P}}$ of integer partitions examined in Chapter I. [In the unlabelled
universe, there are special combinatorial isomorphisms like: $\mathrm{SEQ}_{\ge 1}(\mathcal{Z}) \cong \mathrm{MSET}_{\ge 1}(\mathcal{Z}) \cong \mathrm{CYC}(\mathcal{Z})$.
In the labelled universe, the identity $\mathrm{SET} \circ \mathrm{CYC} \equiv \mathrm{SEQ}$ holds.] ◁


II. 3. Surjections, set partitions, and words 


This section and the next are devoted to what could be termed level-two nonrecur- 
sive structures defined by the fact that they combine two constructions. In this section, 
we discuss surjections and set partitions (Section II.3.1), which constitute labelled 
analogues of integer compositions and integer partitions in the unlabelled universe. 
The symbolic method then extends naturally to words over a finite alphabet, where 
it opens access to an analysis of the frequencies of letters composing words. This 
in turn has useful consequences for the study of some classical random allocation 
problems, of which the birthday paradox and the coupon collector problem stand out 
(Section II. 3.2). 


II. 3.1. Surjections and set partitions. We examine classes

\[ \mathcal{R} = \mathrm{SEQ}\{\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}\} \qquad \text{and} \qquad \mathcal{S} = \mathrm{SET}\{\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}\}, \]

corresponding to sequences-of-sets ($\mathcal{R}$) and sets-of-sets ($\mathcal{S}$), or equivalently, sequences
of urns and sets of urns, respectively. Such abstract specifications model very classical
objects of discrete mathematics, namely surjections ($\mathcal{R}$) and set partitions ($\mathcal{S}$).

Surjections with r images. In elementary mathematics, a surjection from a set $A$
to a set $B$ is a function from $A$ to $B$ that assumes each value at least once (an onto
mapping). Fix some integer $r \ge 1$ and let $\mathcal{R}_n^{(r)}$ denote the class of all surjections from
the set $[1..n]$ onto $[1..r]$, whose elements are also called $r$-surjections. Here is a
particular object $\phi \in \mathcal{R}_9^{(5)}$:

\[ \phi: \quad \begin{array}{c|ccccccccc} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \phi(x) & 2 & 1 & 2 & 3 & 5 & 3 & 5 & 3 & 4 \end{array} \tag{9} \]

(Note that, if $\phi(9)$ were 3, then $\phi$ would not be a surjection.) We set $\mathcal{R}^{(r)} = \bigcup_n \mathcal{R}_n^{(r)}$
and proceed to compute the corresponding EGF, $R^{(r)}(z)$. First, let us observe that an
$r$-surjection $\phi \in \mathcal{R}_n^{(r)}$ is determined by the ordered $r$-tuple formed with the collection
of all preimage sets, $(\phi^{-1}(1), \phi^{-1}(2), \ldots, \phi^{-1}(r))$, themselves disjoint nonempty
sets of integers that cover the interval $[1..n]$. In the case of the surjection $\phi$ of (9),
this alternative representation is

\[ \phi: \quad (\{2\}, \{1,3\}, \{4,6,8\}, \{9\}, \{5,7\}). \]

One has the combinatorial specification and EGF relation:

\[ \mathcal{R}^{(r)} = \mathrm{SEQ}_r\{\mathcal{V}\}, \quad \mathcal{V} = \mathrm{SET}_{\ge 1}\{\mathcal{Z}\} \quad \Longrightarrow \quad R^{(r)}(z) = (e^z - 1)^r. \tag{10} \]


There $\mathcal{V} = \mathcal{U} \setminus \{\epsilon\}$ designates the class of urns ($\mathcal{U}$) that are nonempty, with EGF
$V(z) = e^z - 1$, in view of our earlier discussion of urns. In words: “a surjection is a
sequence of nonempty sets”. See Figure 3 for an illustration.

[Drawing omitted: the graph of the surjection from [1..9] onto [1..5].]

x:        1  2  3  4  5  6  7  8  9
φ(x):     2  1  2  3  5  3  5  3  4

[ {2}, {1, 3}, {4, 6, 8}, {9}, {5, 7} ]

FIGURE II.3. The decomposition of surjections as sequences-of-sets: a surjection given
by its graph (top), its table (second line), and its sequence of preimages (bottom lines).


Equation (10) does solve the counting problem for surjections. For small $r$, one
finds

\[ R^{(2)}(z) = e^{2z} - 2e^{z} + 1, \qquad R^{(3)}(z) = e^{3z} - 3e^{2z} + 3e^{z} - 1, \]

whence, by expanding,

\[ R_n^{(2)} = 2^n - 2, \qquad R_n^{(3)} = 3^n - 3 \cdot 2^n + 3. \]

The general formula follows similarly from expanding the $r$th power in (10) by the
binomial theorem, and then extracting coefficients:

\[ R_n^{(r)} = n!\,[z^n] \sum_{j=0}^{r} \binom{r}{j} (-1)^j e^{(r-j)z} = \sum_{j=0}^{r} \binom{r}{j} (-1)^j (r-j)^n. \tag{11} \]
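Formula (11) is easy to test against an exhaustive count for small parameters. In the Python sketch below, the function names are ours; the brute-force routine simply enumerates all maps from $[1..n]$ to $[1..r]$ and keeps those that reach every value.

```python
from itertools import product
from math import comb

def surjections_formula(n, r):
    """Formula (11): sum_j C(r, j) (-1)^j (r - j)^n."""
    return sum(comb(r, j) * (-1) ** j * (r - j) ** n for j in range(r + 1))

def surjections_brute_force(n, r):
    """Count the maps [1..n] -> [1..r] that assume each value at least once."""
    return sum(1 for f in product(range(r), repeat=n) if len(set(f)) == r)

for r in (2, 3):
    print(r, [surjections_formula(n, r) for n in range(1, 7)])
```

The two routines agree on all small cases we tried, and the values for $r = 2$ match the closed form $2^n - 2$ for $n \ge 1$.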
▷ II.4. A direct derivation of the surjection EGF. One can verify the result provided by the
symbolic method by returning to first principles. The preimage of value $j$ by a surjection is a
nonempty set of some cardinality $n_j \ge 1$, so that

\[ R_n^{(r)} = \sum_{(n_1, n_2, \ldots, n_r)} \binom{n}{n_1, n_2, \ldots, n_r}, \tag{12} \]

the sum being taken over $n_j \ge 1$, $n_1 + n_2 + \cdots + n_r = n$. Introduce the numbers
$V_n := [\![ n \ge 1 ]\!]$. The formula (12) then assumes the simpler form

\[ R_n^{(r)} = \sum_{n_1, n_2, \ldots, n_r} \binom{n}{n_1, n_2, \ldots, n_r} V_{n_1} V_{n_2} \cdots V_{n_r}, \tag{13} \]

where the summation now extends to all tuples $(n_1, n_2, \ldots, n_r)$. The EGF of the $V_n$ is
$V(z) = \sum_n V_n z^n/n! = e^z - 1$. Thus the convolution relation (13) leads again to (10). ◁

Set partitions into r blocks. Let $S_n^{(r)}$ denote the number of ways of partitioning
the set $[1..n]$ into $r$ disjoint and nonempty equivalence classes also known as blocks.
We set $\mathcal{S}^{(r)} = \bigcup_n \mathcal{S}_n^{(r)}$; the corresponding objects are called set partitions (the latter
not to be confused with integer partitions examined in Section I. 3). The enumeration
problem for set partitions is closely related to that of surjections. Symbolically, a
partition is determined as a labelled set of classes (blocks), each of which is a nonempty
urn. Thus, one has

\[ \mathcal{S}^{(r)} = \mathrm{SET}_r\{\mathcal{V}\}, \quad \mathcal{V} = \mathrm{SET}_{\ge 1}\{\mathcal{Z}\} \quad \Longrightarrow \quad S^{(r)}(z) = \frac{1}{r!} (e^z - 1)^r. \tag{14} \]
The basic formula connecting the two counting sequences is, in accordance with (10)
and (14),

\[ S_n^{(r)} = \frac{1}{r!} R_n^{(r)}. \]

This can be interpreted directly along the lines of the proof of Theorem II.1: an
$r$-partition is associated with a group of exactly $r!$ distinct $r$-surjections, two surjections
belonging to the same group iff one is obtained from the other by permuting the range
values, $[1..r]$.

The numbers $S_n^{(r)} = n!\,[z^n] S^{(r)}(z)$ are known as the Stirling numbers of the second
kind, or better, the Stirling “partition” numbers. They were briefly encountered
in the previous chapter and discussed in connection with encodings by words (Chapter I,
p. 59). Knuth, following Karamata, advocated for the $S_n^{(r)}$ the notation ${n \brace r}$.
From (11), an explicit form also exists:

\[ S_n^{(r)} \equiv {n \brace r} = \frac{1}{r!} \sum_{j=0}^{r} \binom{r}{j} (-1)^j (r-j)^n. \tag{15} \]

The books by Graham, Knuth, and Patashnik [248] and Comtet [98] contain a thorough
discussion of these numbers; see also APPENDIX A: Stirling numbers, p. 680.


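All surjections and set partitions. Define now the collection of all surjections
and all set partitions by

\[ \mathcal{R} = \bigcup_r \mathcal{R}^{(r)}, \qquad \mathcal{S} = \bigcup_r \mathcal{S}^{(r)}. \]

Thus $\mathcal{R}_n$ is the class of all surjections of $[1..n]$ onto any initial segment of the integers,
and $\mathcal{S}_n$ is the class of all partitions of the set $[1..n]$ into any number of blocks
(Figure 4). Symbolically, one has

\[ \begin{array}{lcl} \mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}) & \Longrightarrow & R(z) = \dfrac{1}{2 - e^z} \\[1ex] \mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}\{\mathcal{Z}\}) & \Longrightarrow & S(z) = e^{e^z - 1}. \end{array} \tag{16} \]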


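The numbers $R_n = n!\,[z^n] R(z)$ and $S_n = n!\,[z^n] S(z)$ are called surjection numbers
(also, “preferential arrangements” numbers, EIS A000670) and Bell numbers
(EIS A000110) respectively. These numbers are well determined by expanding the
EGFs:

\[ R(z) = 1 + z + 3\frac{z^2}{2!} + 13\frac{z^3}{3!} + 75\frac{z^4}{4!} + 541\frac{z^5}{5!} + 4683\frac{z^6}{6!} + 47293\frac{z^7}{7!} + \cdots \]
\[ S(z) = 1 + z + 2\frac{z^2}{2!} + 5\frac{z^3}{3!} + 15\frac{z^4}{4!} + 52\frac{z^5}{5!} + 203\frac{z^6}{6!} + 877\frac{z^7}{7!} + \cdots. \]

[Drawing omitted: the set partitions of sizes 1 to 4.]

FIGURE II.4. A complete listing of all set partitions for sizes n = 1, 2, 3, 4. The
corresponding sequence 1, 1, 2, 5, 15, ... is formed of Bell numbers, EIS A000110.

Explicit expressions as finite double sums result from summing Stirling numbers,

\[ R_n = \sum_{r \ge 0} r!\, {n \brace r} \qquad \text{and} \qquad S_n = \sum_{r \ge 0} {n \brace r}, \]

where each Stirling number is itself a sum given by (15). Alternatively, single (though
infinite) sums result from the expansions

\[ R(z) = \frac{1}{2} \cdot \frac{1}{1 - \frac{1}{2} e^z} = \sum_{\ell=0}^{\infty} \frac{1}{2^{\ell+1}} e^{\ell z}, \qquad S(z) = e^{e^z - 1} = \frac{1}{e}\, e^{e^z} = \frac{1}{e} \sum_{\ell=0}^{\infty} \frac{1}{\ell!} e^{\ell z}, \]

from which coefficient extraction yields

\[ R_n = \frac{1}{2} \sum_{\ell=0}^{\infty} \frac{\ell^n}{2^{\ell}} \qquad \text{and} \qquad S_n = \frac{1}{e} \sum_{\ell=0}^{\infty} \frac{\ell^n}{\ell!}. \]

The formula for Bell numbers was found by Dobinski in 1877.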
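The finite double sums and Dobinski's series lend themselves to a quick numerical cross-check. The Python sketch below uses formula (15) for the Stirling numbers; the function names and the cut-off `terms=60` in the partial sum are our own choices.

```python
from math import comb, e, factorial

def stirling2(n, r):
    """Stirling partition number, via the explicit form (15)."""
    total = sum(comb(r, j) * (-1) ** j * (r - j) ** n for j in range(r + 1))
    return total // factorial(r)

def surjection_number(n):
    """R_n as a sum of r! times Stirling numbers."""
    return sum(factorial(r) * stirling2(n, r) for r in range(n + 1))

def bell_number(n):
    """S_n as a sum of Stirling numbers."""
    return sum(stirling2(n, r) for r in range(n + 1))

def dobinski(n, terms=60):
    """Floating-point partial sum of (1/e) * sum_l l^n / l!."""
    return sum(l ** n / factorial(l) for l in range(terms)) / e

print([surjection_number(n) for n in range(8)])
print([bell_number(n) for n in range(8)])
print(dobinski(7))
```

The two lists reproduce the coefficients displayed in the expansions of $R(z)$ and $S(z)$, and the partial Dobinski sum agrees with the corresponding Bell number up to rounding.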

The asymptotic analysis of the surjection numbers ($R_n$) will be performed in
Chapter IV as one of the very first illustrations of complex asymptotic methods (the
meromorphic case); that of Bell’s partition numbers is best done by means of the
saddle point method exposed in Chapter IX. The asymptotic forms found are

\[ R_n \sim \frac{n!}{2} \cdot \frac{1}{(\log 2)^{n+1}} \qquad \text{and} \qquad S_n \sim n!\, \frac{e^{e^{r(n)} - 1}}{r(n)^{n+1} \sqrt{2\pi \exp(r(n))}}, \tag{17} \]

where $r(n)$ is the positive root of the equation $r e^r = n$. One has $r(n) \sim \log n - \log\log n$,
so that

\[ \log S_n = n \left( \log n - \log\log n - 1 + o(1) \right). \]
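The quality of the first estimate can be observed numerically well before any asymptotic theory is developed. The Python sketch below compares exact surjection numbers, computed from (11) summed over $r$, with the approximation $n!/(2 (\log 2)^{n+1})$; the function names are illustrative.

```python
from math import comb, factorial, log

def surjection_number(n):
    """R_n, obtained by summing formula (11) over all r."""
    return sum(comb(r, j) * (-1) ** j * (r - j) ** n
               for r in range(n + 1) for j in range(r + 1))

def surjection_estimate(n):
    """The asymptotic form n! / (2 (log 2)^(n+1))."""
    return factorial(n) / (2 * log(2) ** (n + 1))

for n in (5, 10, 15):
    exact = surjection_number(n)
    print(n, exact, surjection_estimate(n) / exact)
```

The printed ratios approach 1 very quickly, which is consistent with the exponentially small error expected in the meromorphic case.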


Elementary derivations (i.e., based solely on real analysis) of these asymptotic forms 
are also possible as discussed briefly in APPENDIX B: Laplace’s method, p. 700. 
The line of reasoning adopted for the enumeration of surjections viewed as sequences- 
of-sets and partitions viewed as sets-of-sets yields a general result that is applicable to 
a wide variety of constrained objects. 


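Proposition II.2. Let $\mathcal{R}^{(A,B)}$ be the class of surjections where the cardinalities of
the preimages lie in $A \subseteq \mathbb{Z}_{\ge 1}$ and the cardinality of the range belongs to $B$. The
corresponding EGF is

\[ R^{(A,B)}(z) = \beta(\alpha(z)) \qquad \text{where} \quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}, \quad \beta(z) = \sum_{b \in B} z^b. \]

Let $\mathcal{S}^{(A,B)}$ be the class of set partitions with part sizes in $A \subseteq \mathbb{Z}_{\ge 1}$ and with a
number of blocks that belongs to $B$. The corresponding EGF is

\[ S^{(A,B)}(z) = \beta(\alpha(z)) \qquad \text{where} \quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}, \quad \beta(z) = \sum_{b \in B} \frac{z^b}{b!}. \]

PROOF. One has

\[ \mathcal{R}^{(A,B)} = \mathrm{SEQ}_B\{\mathrm{SET}_A\{\mathcal{Z}\}\} \qquad \text{and} \qquad \mathcal{S}^{(A,B)} = \mathrm{SET}_B\{\mathrm{SET}_A\{\mathcal{Z}\}\}, \]

where, as usual, the subscript $X$ specifies a construction with a number of components
restricted to the integer set $X$.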


EXAMPLE II.6. Smallest and largest blocks in set partitions. Let $e_b(z)$ denote the truncated
exponential function,

\[ e_b(z) = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}. \]

The EGFs $S^{(\le b)}(z) = \exp(e_b(z) - 1)$ and $S^{(> b)}(z) = \exp(e^z - e_b(z))$ correspond to partitions
with all blocks of size $\le b$ and all blocks of size $> b$, respectively. END OF EXAMPLE II.6.


▷ II.5. No singletons. The EGF of partitions without singleton parts is $e^{e^z - 1 - z}$. The EGF of
“double surjections” (each preimage contains at least two elements) is $(2 + z - e^z)^{-1}$. ◁


EXAMPLE II.7. Comtet’s square. An exercise in Comtet’s book [98, Ex. 13, p. 225] serves
beautifully to illustrate the power of the symbolic method. The question is to enumerate set
partitions such that a parity constraint is satisfied by the number of blocks and/or the number of
elements in each block. Then, the EGFs are tabulated as follows:

Set partitions      | Any # of blocks    | Odd # of blocks        | Even # of blocks
Any block sizes     | $e^{e^z - 1}$      | $\sinh(e^z - 1)$       | $\cosh(e^z - 1)$
Odd block sizes     | $e^{\sinh z}$      | $\sinh(\sinh z)$       | $\cosh(\sinh z)$
Even block sizes    | $e^{\cosh z - 1}$  | $\sinh(\cosh z - 1)$   | $\cosh(\cosh z - 1)$




The proof is a direct application of Proposition II.2, upon noting that

\[ e^z, \qquad \sinh z, \qquad \cosh z \]

are the characteristic EGFs of $\mathbb{Z}_{\ge 0}$, $2\mathbb{Z}_{\ge 0} + 1$, and $2\mathbb{Z}_{\ge 0}$ respectively. The sought EGFs are
then obtained by forming the compositions

\[ \left\{ \begin{array}{c} \exp \\ \sinh \\ \cosh \end{array} \right\} \circ \left\{ \begin{array}{c} -1 + \exp \\ \sinh \\ -1 + \cosh \end{array} \right\} \]

in accordance with general principles. END OF EXAMPLE II.7.
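The composition scheme of Proposition II.2 for set partitions can be turned into a small exact computation on truncated series. The Python sketch below is only one way to do it: `restricted_partitions` is a name of our choosing, and the sets $A$ and $B$ are passed as predicates on block sizes and block counts.

```python
from fractions import Fraction
from math import factorial

def restricted_partitions(N, block_sizes, block_counts):
    """Counts S_n^{(A,B)} for n < N from the EGF beta(alpha(z)) of Proposition II.2."""
    # alpha(z) = sum over allowed block sizes a >= 1 of z^a / a!
    alpha = [Fraction(1, factorial(a)) if a >= 1 and block_sizes(a) else Fraction(0)
             for a in range(N)]
    total = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # alpha^0
    for b in range(N):  # alpha^b has no term below z^b, so b < N suffices
        if block_counts(b):
            total = [t + p / factorial(b) for t, p in zip(total, power)]
        nxt = [Fraction(0)] * N  # next power of alpha, truncated at z^N
        for i, pi in enumerate(power):
            if pi:
                for j in range(1, N - i):
                    nxt[i + j] += pi * alpha[j]
        power = nxt
    return [int(c * factorial(n)) for n, c in enumerate(total)]

print(restricted_partitions(8, lambda a: True, lambda b: True))        # all partitions
print(restricted_partitions(8, lambda a: a >= 2, lambda b: True))      # no singletons
print(restricted_partitions(8, lambda a: a % 2 == 1, lambda b: True))  # odd block sizes
```

The first line recovers the Bell numbers, the second the partitions without singletons of Note 5, and the third the entry $e^{\sinh z}$ of Comtet's square; other entries of the square follow by changing the two predicates.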


II. 3.2. Applications to words and random allocations. Numerous enumerative
problems present themselves when analysing statistics on letters in words. They
find applications in the study of random allocations and the design of hashing algorithms
of computer science [434]. Fix an alphabet

\[ \mathcal{X} = \{a_1, a_2, \ldots, a_r\} \]

of cardinality $r$, and let $\mathcal{W}$ be the class of all words over the alphabet $\mathcal{X}$, the size
of a word being its length. A word of length $n$, $w \in \mathcal{W}_n$, is an unconstrained
function from $[1..n]$ to $[1..r]$, the function associating to each position the value
of the corresponding letter in the word (canonically numbered from 1 to $r$). For
instance, let $\mathcal{X} = \{a, b, c, d, r\}$ and take the letters of $\mathcal{X}$ canonically numbered as
$a_1 = a, \ldots, a_5 = r$; for the word $w$ = ‘abracadabra’, the table giving the
position-to-letter mapping is

letter:     a   b   r   a   c   a   d   a   b   r    a
position:   1   2   3   4   5   6   7   8   9   10   11
value:      1   2   5   1   3   1   4   1   2   5    1

which is itself determined by its sequence of preimages:

a = a_1: {1, 4, 6, 8, 11},   b = a_2: {2, 9},   c = a_3: {5},   d = a_4: {7},   r = a_5: {3, 10}.

(In this particular case, all preimages are nonempty, but this need not always be the case.)
The decomposition based on preimages then gives, with $\mathcal{U}$ the class of all urns,

\[ \mathcal{W} \cong \mathcal{U}^r \equiv \mathrm{SEQ}_r\{\mathcal{U}\} \quad \Longrightarrow \quad W(z) = (e^z)^r = e^{rz}, \tag{18} \]

which yields back $W_n = r^n$, as was to be expected. In summary: words over an $r$-ary
alphabet are equivalent to functions into a set of cardinality $r$ and are described by an
$r$-fold labelled product.
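The correspondence between a word and its sequence of preimages is straightforward to program. In the Python sketch below, the function names are ours and positions are numbered from 1, as in the text.

```python
def preimage_sequence(word, alphabet):
    """One set of positions (1-based) per letter, in alphabet order; sets may be empty."""
    return [{i for i, letter in enumerate(word, start=1) if letter == a}
            for a in alphabet]

def word_from_preimages(preimages, alphabet):
    """Inverse map: rebuild the word from its sequence of preimage sets."""
    n = sum(len(s) for s in preimages)
    letters = [None] * n
    for a, positions in zip(alphabet, preimages):
        for i in positions:
            letters[i - 1] = a
    return "".join(letters)

urns = preimage_sequence("abracadabra", "abcdr")
print(urns)
print(word_from_preimages(urns, "abcdr"))
```

Each of the $r$ sets is an urn, possibly empty, and the labels across all urns form exactly $[1..n]$, which is the content of the specification (18).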

For the situation where restrictions are imposed on the number of occurrences of 
letters, the decomposition (18) generalizes as follows. 


Proposition II.3. Let $\mathcal{W}^{(A)}$ denote the family of words such that the number of
occurrences of each letter lies in a set $A$. Then

\[ W^{(A)}(z) = \alpha(z)^r \qquad \text{where} \quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}. \tag{19} \]

The proof is a one-liner: $\mathcal{W}^{(A)} \cong \mathrm{SEQ}_r(\mathrm{SET}_A(\mathcal{Z}))$. Though this result is technically
a shallow consequence of the symbolic method, it has several important applications
in discrete probability; see [434, Ch. 8] for a discussion along the lines of the
symbolic method.


EXAMPLE II.8. Restricted words. The EGF of words containing at most $b$ times each letter,
and that of words containing more than $b$ times each letter are

\[ W^{(\le b)}(z) = (e_b(z))^r, \qquad W^{(> b)}(z) = (e^z - e_b(z))^r, \tag{20} \]

respectively. (Observe the analogy with Example 6.) Taking $b = 1$ in the first formula gives the
number of $n$-arrangements of $r$ elements (i.e., of ordered combinations of $n$ elements amongst $r$
possibilities),

\[ n!\,[z^n] (1 + z)^r = n! \binom{r}{n} = r(r-1) \cdots (r-n+1), \tag{21} \]

as anticipated; taking $b = 0$, but now in the second formula, gives back the number of
$r$-surjections. For general $b$, the generating functions of (20) contain valuable information on the
least frequent and most frequent letter in random words. END OF EXAMPLE II.8.


EXAMPLE II.9. Random allocations (balls-in-bins model). Throw at random n distinguishable
balls into m distinguishable bins. A particular realization is described by a word of length n
(balls are distinguishable, say, as numbers from 1 to n) over an alphabet of cardinality m
(representing the bins chosen). Let Min and Max represent the size of the least filled and most
filled bins, respectively. Then⁴,


(22) $\mathbb{P}\{\mathrm{Max} \le b\} = n!\,[z^n]\left(e_b\!\left(\frac{z}{m}\right)\right)^m,$
     $\mathbb{P}\{\mathrm{Min} > b\} = n!\,[z^n]\left(e^{z/m} - e_b\!\left(\frac{z}{m}\right)\right)^m.$

The justification of this formula relies on the easy identity

(23) $\frac{1}{m^n}[z^n]f(z) \equiv [z^n]f\!\left(\frac{z}{m}\right),$


and on the fact that a probability is determined as the ratio between the number of favourable
cases (given by (20)) and the total number of cases ($m^n$). The formulae of (22) lend themselves
to evaluation using symbolic manipulation systems; for instance, with m = 100 and n = 200,
one finds for P(Max = k), where k = 2, 4, 5, ..., the values:


k:           2         4            5     6     7     8     9     12           15           20
P(Max = k):  10^{-?}   1.4·10^{-?}  0.17  0.46  0.26  0.07  0.01  9.2·10^{-5}  2.3·10^{-7}  4.7·10^{-?}


The values k = 5, 6, 7, 8 concentrate about 99% of the probability mass.

An especially interesting case is when m and n are asymptotically proportional, that is,
n/m = α and α lies in a compact subinterval of (0, +∞). In that case, with probability
tending to 1 as n tends to infinity, one has


$\mathrm{Min} = 0, \qquad \mathrm{Max} \sim \frac{\log n}{\log\log n}.$


In other words, there are almost surely empty urns (in fact many of them, see Example 9 in
Chapter III) and the most filled urn grows logarithmically in size. Such probabilistic
properties are best established by complex analytic methods (especially the saddle point method
detailed in Chapter VIII) based on exact generating representations like (20) and (22). They
form the core of the reference book [316] by Kolchin, Sevastyanov, and Chistyakov. The
resulting estimates are in turn invaluable in the analysis of hashing algorithms [242, 307, 434]
to which the balls-in-bins model has been recognized to apply with great accuracy [347].
END OF EXAMPLE II.9.


⁴We let P(E) represent the probability of an event E and E(X) the expectation of the random
variable X; cf. APPENDIX C: Random variables, p. 717.
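
The first formula of (22), $\mathbb{P}\{\mathrm{Max} \le b\} = n!\,[z^n](e_b(z/m))^m$, can be evaluated mechanically. The sketch below is one hedged way of doing so with exact fractions; the helper names `prob_max_at_most` and `prob_max_equals` are hypothetical, and the m-fold power is formed by plain repeated multiplication truncated at degree n.

```python
from fractions import Fraction
from math import factorial

def prob_max_at_most(m, n, b):
    """P{Max <= b} for n balls thrown into m bins, as n! [z^n] e_b(z/m)^m."""
    # Coefficients of the truncated exponential e_b(z/m), up to degree min(b, n).
    base = [Fraction(1, factorial(j) * m**j) for j in range(min(b, n) + 1)]
    power = [Fraction(1)] + [Fraction(0)] * n      # the constant series 1
    for _ in range(m):                             # multiply m times by e_b(z/m)
        nxt = [Fraction(0)] * (n + 1)
        for i, a in enumerate(power):
            if a:
                for j, c in enumerate(base):
                    if i + j > n:
                        break
                    nxt[i + j] += a * c
        power = nxt
    return power[n] * factorial(n)

def prob_max_equals(m, n, k):
    """P{Max = k}, obtained as a difference of two cumulative values."""
    return prob_max_at_most(m, n, k) - prob_max_at_most(m, n, k - 1)

# Small case: 2 balls, 2 bins; the maximum is 1 exactly when the balls are separated.
print(prob_max_at_most(2, 2, 1))
```

With m = 100 and n = 200 the same functions can be compared against the values tabulated above, at the cost of a longer (but still exact) computation.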


▷ II.6. Number of different letters in words. The probability that a random word of length n
over an alphabet of cardinality r contains k different letters is


$p_k^{(r,n)} := \frac{1}{r^n}\binom{r}{k}\,k!\,\left\{{n \atop k}\right\}.$


(Choose k letters amongst r, then split the n positions into k distinguished nonempty classes.)
The quantity $p_k^{(r,n)}$ is also the probability that a random mapping from [1..n] to [1..r] has an
image of cardinality k. ◁

▷ II.7. Arrangements. An arrangement of size n is an ordered combination of (some) elements
of [1..n]. Let $\mathcal{A}$ be the class of all arrangements. Grouping together all the possible elements
not present in the arrangement into an urn shows that a specification and its companion EGF
are


$\mathcal{A} \cong \mathcal{U} \star \mathcal{P}, \quad \mathcal{U} = \mathrm{SET}\{\mathcal{Z}\}, \quad \mathcal{P} = \mathrm{SEQ}\{\mathcal{Z}\} \;\Longrightarrow\; A(z) = \frac{e^z}{1-z}.$


The counting sequence $A_n = \sum_{k=0}^{n} \frac{n!}{k!}$ starts as 1, 2, 5, 16, 65, 326, 1957 (EIS A000522); see
also Comtet [98, p. 75]. ◁


Birthday paradox and coupon collector problem. The next two examples illus- 
trate applications of EGFs to two classical problems of probability theory, the birthday 
paradox and the coupon collector problem. Assume that there is a very long line of 
persons ready to enter a very large room one by one. Each person is let in and de- 
clares her birthday upon entering the room. How many people must enter in order 
to find two that have the same birthday? The birthday paradox is the counterintuitive 
fact that on average a birthday collision takes place as early as n = 24. Dually, the 
coupon collector problem asks for the average number of persons that must enter in 
order to exhaust all the possible days in the year as birthdates. In this case, the
answer is the rather large number n' = 2364. The term “coupon collection” alludes
to the situation where images or coupons of various sorts are inserted in sales items 
and some premium is given to those who succeed in gathering a complete collection. 
The birthday problem and the coupon collector problem are relative to a potentially 
infinite sequence of events; however, the fact that the first birthday collision or the 
first complete collection occurs at any fixed time n only involves finite events. The 
following diagram illustrates the events of interest: 


n = 0 ---------- B (1st collision) ---------- C (complete collection) ----------> n → +∞
      INJECTIVE                                                       SURJECTIVE


In other words, we seek the time at which injectivity ceases to hold (the first birthday
collision, B) and the time at which surjectivity begins to be satisfied (a complete
collection, C). In what follows, we consider a year with r days (readers from earth
may take r = 365) and let V represent an alphabet with r letters (the days in the year).


EXAMPLE II.10. Birthday paradox. Let B be the time of the first collision, which is a random
variable ranging between 2 and r + 1 (where the upper bound derives from the pigeonhole
principle). A collision has not yet occurred at time n if the sequence of birthdates
$\beta_1, \ldots, \beta_n$ has no repetition. In other words, the function β from [1..n] to V must be
injective; equivalently, $\beta_1, \ldots, \beta_n$ is an n-arrangement of r objects. Thus, we have the
fundamental relation

(24) $\mathbb{P}\{B > n\} = \frac{r(r-1)\cdots(r-n+1)}{r^n}$
     $\phantom{\mathbb{P}\{B > n\}} = \frac{n!}{r^n}\,[z^n](1+z)^r$
     $\phantom{\mathbb{P}\{B > n\}} = n!\,[z^n]\left(1+\frac{z}{r}\right)^r,$

where the second line repeats (21) and the third results from the series transformation (23).
The expectation of the random variable B is elementarily


(25) $\mathbb{E}(B) = \sum_{n=0}^{\infty} \mathbb{P}\{B > n\},$


this by virtue of a general formula valid for all discrete random variables (APPENDIX C:
Random variables, p. 717). From (24), line 1, this gives us a sum expressing the expectation,
namely,


$\mathbb{E}(B) = 1 + \sum_{n=1}^{r} \frac{r(r-1)\cdots(r-n+1)}{r^n}.$


For instance, with r = 365, one finds that the expectation is the rational number,


$\mathbb{E}(B) = \frac{12681\cdots06674}{51517\cdots40625} \doteq 24.61658,$

where the denominator comprises as many as 864 digits.
An alternative form of the expectation derives from the generating function involved in (24),
line 3. Let $f(z) = \sum_{n \ge 0} f_n z^n$ be an entire function with nonnegative coefficients. Then the
formula


(26) $\sum_{n=0}^{\infty} f_n\, n! = \int_0^{\infty} e^{-t} f(t)\, dt,$


is valid provided either the sum or the integral on the right converges. The reason is the usual
Eulerian representation of factorials,


$n! = \int_0^{\infty} e^{-t}\, t^n\, dt.$


Applying this principle to (25) with the probabilities given by (24) (third line), one finds


(27) $\mathbb{E}(B) = \int_0^{\infty} e^{-t}\left(1 + \frac{t}{r}\right)^r dt.$


This last form is easily amenable to asymptotic analysis and the Laplace method⁵ (see
APPENDIX B: Laplace’s method, p. 700) provides the estimation


(28) $\mathbb{E}(B) = \sqrt{\frac{\pi r}{2}} + \frac{2}{3} + O(r^{-1/2}),$


as r tends to infinity. For instance, the asymptotic approximation given by the first two terms
of (28) is 24.61119, which represents a relative error of only $2 \cdot 10^{-4}$.


⁵Knuth [306, Sec. 1.2.11.3] uses this calculation as a pilot example for (real) asymptotic analysis; the
quantity E(B) is related to Ramanujan’s Q-function (see also Eq. (45) below) by E(B) = 1 + Q(r).


[Plot: letter chosen (vertical axis, up to 20) against time of arrival (horizontal axis, 0 to 80).]
FIGURE II.5. A sample realization of the “birthday paradox” and “coupon collection”
with an alphabet of r = 20 letters. The first collision occurs at time B = 6 while the
collection becomes complete at time C = 87.


The interest of such integral representations based on generating functions is that they are
robust: they adjust naturally to many kinds of combinatorial conditions. For instance, the
time of the first occurrence of the event “b persons have the same birthday”
is found to have expectation given by the integral


(29) $I(r,b) := \int_0^{\infty} e^{-t} \left(e_{b-1}\!\left(\frac{t}{r}\right)\right)^r dt.$


(The basic birthday paradox corresponds to b = 2.) The formula (29) was first derived by
Klamkin and Newman in 1967; their paper [293] shows in addition that


$I(r,b) \sim \sqrt[b]{b!}\;\Gamma\!\left(1 + \frac{1}{b}\right) r^{1-1/b},$

where the asymptotic form evaluates to 82.87 for r = 365 and b = 3, while the exact
value of the expectation is 88.73891. Thus three-way collisions also tend to occur much
sooner than one might think, with about 89 persons on average. Globally, such developments
illustrate the versatility of the symbolic approach to many basic probabilistic problems.
END OF EXAMPLE II.10.
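
The exact sum $\mathbb{E}(B) = 1 + \sum_{n=1}^{r} r(r-1)\cdots(r-n+1)/r^n$ and the two-term estimate $\sqrt{\pi r/2} + 2/3$ are easy to compare numerically. The following is a minimal sketch; the function names are illustrative, and the running product simply updates $\mathbb{P}\{B > n\}$ from $\mathbb{P}\{B > n-1\}$.

```python
from fractions import Fraction
from math import pi, sqrt

def expected_first_collision(r):
    """E(B) = sum over n >= 0 of P{B > n}, computed exactly for a year of r days."""
    total = Fraction(1)          # the n = 0 term, P{B > 0} = 1
    term = Fraction(1)
    for n in range(1, r + 1):
        term *= Fraction(r - n + 1, r)   # P{B > n} = P{B > n-1} * (r-n+1)/r
        total += term
    return total

def two_term_estimate(r):
    """First two terms of the Laplace-method expansion quoted in the text."""
    return sqrt(pi * r / 2) + 2 / 3

exact = expected_first_collision(365)
print(float(exact), two_term_estimate(365))
```

For r = 2 the exact value is 5/2, which can be checked by hand: the collision occurs at time 2 or at time 3, each with probability 1/2.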


▷ II.8. The probability distribution of time till a birthday collision. Elementary approximations
show that, for large r, and in the “central” regime $n = t\sqrt{r}$, one has


$\mathbb{P}(B > t\sqrt{r}) \sim e^{-t^2/2}, \qquad \mathbb{P}(B = t\sqrt{r}) \sim \frac{1}{\sqrt{r}}\, t\, e^{-t^2/2}.$


The continuous probability distribution with density $t\,e^{-t^2/2}$ is called a Rayleigh distribution.
Saddle point methods (Chapter VIII) may be used to show that for the first occurrence of a
b-fold birthday collision: $\mathbb{P}(B > t\, r^{1-1/b}) \sim e^{-t^b/b!}$. ◁
EXAMPLE II.11. Coupon collector problem. This problem is dual to the birthday paradox.
We ask for the first time C when $\beta_1, \ldots, \beta_C$ contains all the elements of V, that is, all the
possible birthdates have been “collected”. In other words, the event $\{C \le n\}$ means the
equality between sets, $\{\beta_1, \ldots, \beta_n\} = V$. Thus, the probabilities satisfy


(30) $\mathbb{P}\{C \le n\} = \frac{R_n^{(r)}}{r^n} = \frac{r!}{r^n}\left\{{n \atop r}\right\}$
     $\phantom{\mathbb{P}\{C \le n\}} = \frac{n!}{r^n}\,[z^n](e^z - 1)^r$
     $\phantom{\mathbb{P}\{C \le n\}} = n!\,[z^n]\left(e^{z/r} - 1\right)^r,$

by our earlier enumeration of surjections. The complementary probabilities are then

$\mathbb{P}\{C > n\} = 1 - \mathbb{P}\{C \le n\} = n!\,[z^n]\left(e^z - \left(e^{z/r} - 1\right)^r\right).$

An application of the Eulerian integral trick of (27) then provides a representation of the
expectation of the time needed for a full collection as


(31) $\mathbb{E}(C) = \int_0^{\infty} \left(1 - \left(1 - e^{-t/r}\right)^r\right) dt.$


A simple calculation (expand by the binomial theorem and integrate termwise) shows that


$\mathbb{E}(C) = r \sum_{j=1}^{r} \binom{r}{j} \frac{(-1)^{j-1}}{j},$


which constitutes a first answer to the coupon collector problem in the form of an alternating
sum. Alternatively, in (31), perform the change of variables $v = 1 - e^{-t/r}$, then expand and
integrate termwise; this process provides the more tractable form


(32) $\mathbb{E}(C) = r H_r,$

where $H_r$ is the harmonic number:

$H_r = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{r}.$


Formula (32) is by the way easy to interpret directly⁶: one needs on average 1 = r/r trials to
get the first day, then r/(r − 1) to get a different day, etc.

Regarding (32), one has available the well-known formula (by comparing sums with
integrals or by Euler-Maclaurin summation),


$H_r = \log r + \gamma + \frac{1}{2r} + O(r^{-2}), \qquad \gamma \doteq 0.57721\,56649,$


where γ is known as Euler’s constant. Thus, the expected time for a full collection satisfies

(33) $\mathbb{E}(C) = r \log r + \gamma r + \frac{1}{2} + O(r^{-1}).$


Here the “surprise” lies in the nonlinear growth of the expected time for a full collection. For
a year on earth, r = 365, the exact expected value is ≐ 2364.64602 while the approximation
provided by the first three terms of (33) yields 2364.64625, representing a relative error of only
one in ten million.


⁶Such elementary derivations are very much problem specific: contrary to the symbolic method, they
do not usually generalize to more complex situations.
As usual, the symbolic treatment adapts to a variety of situations, for instance, to multiple
collections. The expected time till each item (birthday or coupon) is obtained b times (the
standard case corresponds to b = 1) equals the quantity


$J(r,b) = \int_0^{\infty} \left(1 - \left(1 - e_{b-1}(t/r)\, e^{-t/r}\right)^r\right) dt,$


an expression that vastly generalizes (31). From there, one finds [372]


$J(r,b) = r\left(\log r + (b-1)\log\log r + \gamma - \log (b-1)! + o(1)\right),$

so that only a few more trials are needed in order to obtain additional collections. END OF EXAMPLE II.11.
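
The three expressions obtained for the coupon collector (the alternating sum, the closed form $r H_r$ of (32), and the three-term estimate of (33)) can be checked against one another. The sketch below is illustrative only; the function names are hypothetical and Euler's constant is hard-coded to the ten digits quoted in the text.

```python
from fractions import Fraction
from math import comb, log

def harmonic(r):
    """H_r = 1 + 1/2 + ... + 1/r as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, r + 1))

def expected_collection_time(r):
    """E(C) = r H_r, formula (32)."""
    return r * harmonic(r)

def alternating_sum(r):
    """E(C) = r * sum_{j=1}^{r} binom(r, j) (-1)^(j-1) / j."""
    return r * sum(Fraction((-1) ** (j - 1) * comb(r, j), j)
                   for j in range(1, r + 1))

def three_term_estimate(r):
    """r log r + gamma r + 1/2, the first three terms of (33)."""
    euler_gamma = 0.5772156649      # value quoted in the text
    return r * log(r) + euler_gamma * r + 0.5

print(float(expected_collection_time(365)), three_term_estimate(365))
```

For r = 2 the expectation is 3: one trial for the first coupon, then two on average for the other.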


▷ II.9. The little sister. The coupon collector has a little sister to whom he gives his duplicates.
Foata, Lass, and Han [218] show that the little sister misses on average $H_r$ coupons when her
big brother first obtains a complete collection. ◁


▷ II.10. The probability distribution of time till a complete collection. The saddle point method
(Chapter VIII) may be used to prove that, in the regime n = r log r + tr:


$\lim_{r \to \infty} \mathbb{P}(C \le r \log r + tr) = e^{-e^{-t}}.$

This continuous probability distribution is known as a double exponential distribution. For the
time $C^{(b)}$ till a collection of multiplicity b, one has


$\lim_{r \to \infty} \mathbb{P}(C^{(b)} < r \log r + (b-1)\, r \log\log r + tr) = \exp\left(-e^{-t}/(b-1)!\right),$


a property known as the Erdős-Rényi law, which finds applications in the study of random
graphs [154]. ◁
Words as both labelled and unlabelled objects. What distinguishes a labelled
structure from an unlabelled one? There is nothing intrinsic there, and everything is in
the eye of the beholder, or rather in the type of construction adopted when modelling
a specific problem. Take the class of words $\mathcal{W}$ over an alphabet of cardinality r. The
two generating functions (an OGF and an EGF respectively),

$\widehat{W}(z) \equiv \sum_{n} W_n z^n = \frac{1}{1 - rz}$ and $W(z) \equiv \sum_{n} W_n \frac{z^n}{n!} = e^{rz},$


leading in both cases to $W_n = r^n$, correspond to two different ways of constructing
words: the first one directly as an unlabelled sequence, the other one as a labelled
power of letter positions. A similar situation arises for r-partitions, for which we
found as OGF and EGF,

$\widehat{S}^{(r)}(z) = \frac{z^r}{(1-z)(1-2z)\cdots(1-rz)}$ and $S^{(r)}(z) = \frac{(e^z - 1)^r}{r!},$

by viewing these either as unlabelled structures (an encoding via words of a regular
language, see Section I.4.3) or directly as labelled structures.



▷ II.11. Balls switching chambers: the Ehrenfest² model. Consider a system of two chambers
A and B (also classically called “urns”). There are N distinguishable balls, and, initially,
chamber A contains them all. At any instant 1, 2, ..., one ball is allowed to change from one
chamber to the other. Let $E_n^{[\ell]}$ be the number of possible evolutions that lead to chamber A
containing ℓ balls at instant n and $E^{[\ell]}(z)$ the corresponding EGF. Then


$E^{[\ell]}(z) = \binom{N}{\ell} (\cosh z)^{\ell} (\sinh z)^{N-\ell}, \qquad E^{[N]}(z) = (\cosh z)^N \equiv 2^{-N}(e^z + e^{-z})^N.$

[Hint: the EGF $E^{[N]}$ enumerates mappings where each preimage has an even cardinality.] In
particular the probability that urn A is again full at time 2n is


$\frac{1}{2^N N^{2n}} \sum_{k=0}^{N} \binom{N}{k} (N - 2k)^{2n}.$


This famous model was introduced by Paul and Tatiana Ehrenfest [148] in 1907, as a simplified
model of heat transfer. It helped resolve the apparent contradiction between irreversibility in
thermodynamics (the case N → ∞) and recurrence of systems undergoing ergodic
transformations (the case N < ∞). See especially Mark Kac’s discussion [288]. The analysis can also
be carried out by combinatorial methods akin to those of weighted lattice paths: see Note V.22,
p. 313 and [245]. ◁
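
The return probability of the Ehrenfest model, read here as $2^{-N} N^{-2n} \sum_{k=0}^{N} \binom{N}{k}(N-2k)^{2n}$, can be cross-checked against a direct step-by-step computation. The sketch below assumes that each of the $N^{2n}$ evolutions is equally likely, so that a chamber holding a balls loses one with probability a/N; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def return_probability_formula(N, n):
    """Closed form derived from the EGF (cosh z)^N."""
    total = sum(comb(N, k) * (N - 2 * k) ** (2 * n) for k in range(N + 1))
    return Fraction(total, 2**N * N ** (2 * n))

def return_probability_direct(N, n):
    """Propagate the distribution of the number of balls in chamber A over 2n steps."""
    dist = {N: Fraction(1)}                 # chamber A starts full
    for _ in range(2 * n):
        nxt = {}
        for a, p in dist.items():
            if a > 0:                       # a ball leaves A with probability a/N
                nxt[a - 1] = nxt.get(a - 1, 0) + p * Fraction(a, N)
            if a < N:                       # a ball enters A with probability (N-a)/N
                nxt[a + 1] = nxt.get(a + 1, 0) + p * Fraction(N - a, N)
        dist = nxt
    return dist.get(N, Fraction(0))

print(return_probability_formula(3, 2), return_probability_direct(3, 2))
```

Agreement of the two functions on small cases is only a sanity check of the formula, not a proof.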


II. 4. Alignments, permutations, and related structures 


In this section, we start by considering specifications built by piling up two con- 
structions, sequences-of-cycles and sets-of-cycles respectively. They define a new 
class of objects, alignments, while serving to specify permutations in a novel way 
as detailed below. (These specifications otherwise parallel surjections and set parti- 
tions.) Permutations are in this context examined under their cycle decomposition, 
the corresponding enumerative results being the most important ones combinatorially 
(Subsection II. 4.1). In Subsection II. 4.2, we recapitulate the meaning of classes that 
can be defined iteratively by a combination of any two nested labelled constructions. 


II. 4.1. Alignments and Permutations. The two specifications under consider- 
ation here are 


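(34) $\mathcal{O} = \mathrm{SEQ}\{\mathrm{CYC}\{\mathcal{Z}\}\}$, and $\mathcal{P} = \mathrm{SET}\{\mathrm{CYC}\{\mathcal{Z}\}\}$,

defining new objects called alignments ($\mathcal{O}$) and an important decomposition of
permutations ($\mathcal{P}$).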


Alignments. An alignment is a well-labelled sequence of cycles. Let O be the 
class of all alignments. Schematically, one can visualize an alignment as a collection 
of directed cycles arranged in a linear order, somewhat like slices of a sausage fastened 
on a skewer: 


The symbolic method provides,


$\mathcal{O} = \mathrm{SEQ}\{\mathrm{CYC}\{\mathcal{Z}\}\} \;\Longrightarrow\; O(z) = \frac{1}{1 - \log\frac{1}{1-z}},$

and the expansion starts as

$O(z) = 1 + z + 3\frac{z^2}{2!} + 14\frac{z^3}{3!} + 88\frac{z^4}{4!} + 694\frac{z^5}{5!} + \cdots,$

but the coefficients (EIS A007840: “ordered factorizations of permutations into
cycles”) appear to admit of no simple form.


A permutation may be viewed as a set of cycles that are labelled circular digraphs. The diagram
shows the decomposition of the permutation


σ = (  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
      11 12 13 17 10 15 14  9  3  4  6  2  7  8  1  5 16 ).


(Cycles read here clockwise and i is connected to $\sigma_i$ by an edge in the graph.)


FIGURE II.6. The cycle decomposition of permutations.


Permutations and cycles. From elementary mathematics, it is known that a
permutation admits a unique decomposition into cycles. Let $\sigma = \sigma_1 \ldots \sigma_n$ be a
permutation. Start with any element, say 1, and draw a directed edge from 1 to σ(1), then
continue connecting to $\sigma^2(1), \sigma^3(1)$, and so on; a cycle containing 1 is obtained after
at most n steps. If one repeats the construction, taking at each stage an element not yet
connected to earlier ones, the cycle decomposition of the permutation σ is obtained.
This argument shows that the class of sets-of-cycles (corresponding to $\mathcal{P}$ in (34)) is
isomorphic to the class of permutations as defined in Section II. 1:


$\mathcal{P} \cong \mathrm{SET}\{\mathrm{CYC}\{\mathcal{Z}\}\} \cong \mathrm{SEQ}\{\mathcal{Z}\}.$


This combinatorial isomorphism is reflected by the obvious series identity


$P(z) = \exp\left(\log\frac{1}{1-z}\right) = \frac{1}{1-z}.$


The property that exp and log are inverse of one another is an analytic reflex of the
combinatorial fact that permutations uniquely decompose into cycles!

As regards combinatorial applications, what is especially fruitful is the variety of 
specializations of the construction of permutations from cycles. We state: 


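Proposition II.4. Let $\mathcal{P}^{(A,B)}$ be the class of permutations with cycle lengths in
$A \subseteq \mathbb{Z}_{>0}$ and with a number of cycles that belongs to $B \subseteq \mathbb{Z}_{\ge 0}$. The corresponding EGF
is


$P^{(A,B)}(z) = \beta(\alpha(z))$ where $\alpha(z) = \sum_{a \in A} \frac{z^a}{a}, \quad \beta(z) = \sum_{b \in B} \frac{z^b}{b!}.$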


EXAMPLE II.12. Stirling cycle numbers. The number of permutations of size n comprised
of r cycles is determined by the explicit generating function, to the effect that


(35) $P_n^{(r)} \equiv \left[{n \atop r}\right] = \frac{n!}{r!}\,[z^n]\left(\log\frac{1}{1-z}\right)^r.$

These numbers are fundamental quantities of combinatorial analysis. They are known as the
Stirling numbers of the first kind, or better, according to a proposal of Knuth, the Stirling cycle
numbers. Together with the Stirling partition numbers, the properties of the Stirling cycle
numbers are explored in the book by Graham, Knuth, and Patashnik [248] where they are denoted
by $\left[{n \atop k}\right]$. See APPENDIX A: Stirling numbers, p. 680. (Note that the number of alignments
formed with r cycles is $r!\left[{n \atop r}\right]$.) As we shall see shortly (p. 130) Stirling numbers also surface in
the enumeration of permutations by their number of records.

It is also of interest to determine what happens regarding cycles in a random permutation of
size n. Clearly, when the uniform distribution is placed over all elements of $\mathcal{P}_n$, each particular
permutation has probability exactly 1/n!. Since the probability of an event is the quotient of
the number of favourable cases over the total number of cases, the quantity


$p_{n,k} := \frac{1}{n!}\left[{n \atop k}\right]$


is the probability that a random element of $\mathcal{P}_n$ has k cycles. These probabilities can be effectively
determined for moderate values of n from (35) by means of a computer algebra system. Here
are for instance selected values for n = 100:

k:        1     2     3     4     5     6     7     8     9     10
p_{n,k}:  0.01  0.05  0.12  0.19  0.21  0.17  0.11  0.06  0.03  0.01


For this value n = 100, we expect in a vast majority of cases the number of cycles to be in the 
interval [1, 10]. (The residual probability is only about 0.005.) Under this probabilistic model, 
the mean is found to be about 5.18. Thus: A random permutation of size 100 has on average a 
little more than 5 cycles; it rarely has more than 10 cycles. 

Such procedures demonstrate a direct exploitation of symbolic methods. They do not
however tell us how the number of cycles could depend on n as n varies. Such questions are to
be examined systematically in Chapter III. Here, we shall content ourselves with a brief sketch.
First, form the bivariate generating function,


$P(z,u) := \sum_{r=0}^{\infty} P^{(r)}(z)\, u^r,$

where $P^{(r)}(z)$ is the EGF of permutations with r cycles, and observe that


$P(z,u) = \sum_{r=0}^{\infty} \frac{u^r}{r!}\left(\log\frac{1}{1-z}\right)^r = \exp\left(u \log\frac{1}{1-z}\right) = (1-z)^{-u}.$


Newton’s binomial theorem then provides


$[z^n](1-z)^{-u} = (-1)^n \binom{-u}{n}.$


In other words, a simple formula


(36) $\sum_{k=0}^{n} \left[{n \atop k}\right] u^k = u(u+1)(u+2)\cdots(u+n-1)$

encodes precisely all the Stirling cycle numbers corresponding to a fixed value of n. From there,
the expected number of cycles, $\mu_n := \sum_k k\, p_{n,k}$, is easily found (use logarithmic differentiation
of (36)),

$\mu_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} = H_n.$

In particular, one has $\mu_{100} = H_{100} \doteq 5.18738$. In general: The mean number of cycles in a
random permutation of size n grows logarithmically with n, $\mu_n \sim \log n$. END OF EXAMPLE II.12.
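
The identity $\sum_k \left[{n \atop k}\right] u^k = u(u+1)\cdots(u+n-1)$ gives a direct way of tabulating the Stirling cycle numbers and the cycle distribution without a computer algebra system. The following sketch expands the product one factor at a time; the function names are illustrative choices.

```python
from fractions import Fraction
from math import factorial

def stirling_cycle_row(n):
    """Coefficients of u(u+1)...(u+n-1): entry k is the number of
    permutations of size n with k cycles."""
    row = [1]                                   # empty product
    for j in range(n):                          # multiply by (u + j)
        new = [0] * (len(row) + 1)
        for k, c in enumerate(row):
            new[k] += j * c
            new[k + 1] += c
        row = new
    return row

def cycle_distribution(n):
    """Probabilities p_{n,k}, k = 0..n, under the uniform model."""
    return [Fraction(c, factorial(n)) for c in stirling_cycle_row(n)]

def mean_cycles(n):
    """Expected number of cycles, to be compared with the harmonic number H_n."""
    return sum(k * p for k, p in enumerate(cycle_distribution(n)))

print(stirling_cycle_row(4))
print(float(mean_cycles(100)))
```

The exact mean for n = 100 can be compared with $H_{100}$, and the list returned by `cycle_distribution(100)` with the selected values tabulated above.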


EXAMPLE II.13. Involutions and permutations without long cycles. A permutation σ is an
involution if $\sigma^2 = \mathrm{Id}$ with Id the identity permutation. Clearly, an involution can have only
cycles of sizes 1 and 2. The class $\mathcal{I}$ of all involutions thus satisfies


(37) $\mathcal{I} = \mathrm{SET}\{\mathrm{CYC}_{1,2}\{\mathcal{Z}\}\} \;\Longrightarrow\; I(z) = \exp\left(z + \frac{z^2}{2}\right).$

The explicit form of the EGF lends itself to expansion,

$I_n = \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{n!}{(n-2k)!\, 2^k\, k!},$


which solves the counting problem explicitly. A pairing is an involution without fixed point.
In other words, only cycles of length 2 are allowed, so that


$\mathcal{J} = \mathrm{SET}\{\mathrm{CYC}_2\{\mathcal{Z}\}\} \;\Longrightarrow\; J(z) = e^{z^2/2}, \qquad J_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n-1).$


(The formula for $J_n$, hence that of $I_n$, can be checked by a direct reasoning.)
Generally, the EGF of permutations, all of whose cycles (in particular the largest one) have
length at most equal to r satisfies


$B^{(r)}(z) = \exp\left(\sum_{j=1}^{r} \frac{z^j}{j}\right).$


The numbers $b_n^{(r)} = [z^n]B^{(r)}(z)$ satisfy the recurrence


$(n+1)\, b_{n+1}^{(r)} = (n+1)\, b_n^{(r)} - b_{n-r}^{(r)},$

by which they can be computed fast. This gives access to the statistics of the longest cycle in a
permutation. END OF EXAMPLE II.13.
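
The recurrence $(n+1)\,b_{n+1}^{(r)} = (n+1)\,b_n^{(r)} - b_{n-r}^{(r)}$ translates into a few lines of code. The sketch below takes $b_0^{(r)} = 1$ and $b_n^{(r)} = 0$ for negative n (an assumption consistent with the EGF), and checks the case r = 2 against the explicit involution formula; all names are illustrative.

```python
from fractions import Fraction
from math import factorial

def involution_count(n):
    """I_n as the sum over k (number of 2-cycles) of n! / ((n-2k)! 2^k k!)."""
    return sum(factorial(n) // (factorial(n - 2 * k) * 2**k * factorial(k))
               for k in range(n // 2 + 1))

def bounded_cycle_counts(r, n_max):
    """Numbers of permutations of sizes 0..n_max whose cycles all have length <= r."""
    b = [Fraction(1)]                               # b_0 = 1
    for n in range(n_max):
        earlier = b[n - r] if n - r >= 0 else Fraction(0)
        b.append(b[n] - earlier / (n + 1))          # recurrence divided by (n+1)
    # b_n is an EGF coefficient; multiply by n! to obtain a count.
    return [int(b[n] * factorial(n)) for n in range(n_max + 1)]

print(bounded_cycle_counts(2, 6))
print([involution_count(n) for n in range(7)])
```

Each new coefficient costs a constant number of operations, which is what makes the recurrence attractive for tabulating longest-cycle statistics.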


EXAMPLE II.14. Derangements and permutations without short cycles. Classically, a
derangement is defined as a permutation without fixed points, i.e., $\sigma_i \ne i$ for all i. Given an
integer r, an r-derangement is a permutation all of whose cycles (in particular the shortest one)
have length larger than r. Let $\mathcal{D}^{(r)}$ be the class of all r-derangements. A specification is


(38) $\mathcal{D}^{(r)} = \mathrm{SET}\{\mathrm{CYC}_{>r}\{\mathcal{Z}\}\},$

the corresponding EGF being then


(39) $D^{(r)}(z) = \exp\left(\sum_{j>r} \frac{z^j}{j}\right) = \frac{\exp\left(-\sum_{j=1}^{r} \frac{z^j}{j}\right)}{1-z}.$

For instance, when r = 1, a direct expansion yields


$\frac{D_n^{(1)}}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!},$
a truncation of the series expansion of exp(−1) that converges fast to $e^{-1}$. Phrased differently,
the enumeration of derangements is a famous combinatorial problem with a pleasantly quaint
nineteenth century formulation [98]: “A number n of people go to the opera, leave their hats on
hooks in the cloakroom and grab them at random when leaving; the probability that nobody gets
back his own hat is asymptotic to 1/e, which is nearly 37%”. (The usual proof uses an
inclusion-exclusion argument. Also, it is a sign of changing times that Motwani and Raghavan [370, p. 11]
describe the problem as one of sailors that return in a state of inebriation and choose random
cabins to sleep in.) For the generalized derangement problem, there holds, for any fixed r,


(40) $\frac{D_n^{(r)}}{n!} \sim e^{-H_r},$

as is proved easily by complex asymptotic methods (Chapter IV). END OF EXAMPLE II.14.


Derangements: $\frac{e^{-z}}{1-z}$     Involutions: $e^{z+z^2/2}$     Pairings: $e^{z^2/2}$

Longest cycle ≤ r: $\exp\left(\frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^r}{r}\right)$

FIGURE II.7. A summary of major EGFs related to permutations.
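
The estimate $D_n^{(r)}/n! \sim e^{-H_r}$ can be observed numerically. The sketch below does not use the text's closed forms directly: it relies on the recurrence $(n+1)\,d_{n+1} = d_0 + d_1 + \cdots + d_{n-r}$ for $d_n = D_n^{(r)}/n!$, which is obtained here by differentiating $D^{(r)}(z) = \exp\left(\sum_{j>r} z^j/j\right)$ and is an auxiliary step introduced for the illustration. Function names are hypothetical.

```python
from fractions import Fraction
from math import exp, factorial

def derangement_ratios(r, n_max):
    """d_n = D_n^{(r)} / n! for n = 0..n_max (all cycles longer than r)."""
    d = [Fraction(1)]
    for n in range(n_max):
        # (n+1) d_{n+1} = d_0 + ... + d_{n-r}; the sum is empty when n < r.
        partial = sum(d[: max(n - r + 1, 0)], Fraction(0))
        d.append(partial / (n + 1))
    return d

def harmonic(r):
    """H_r as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, r + 1))

ratios = derangement_ratios(2, 60)
print(float(ratios[60]), exp(-float(harmonic(2))))
```

For r = 1 the counts $n!\,d_n$ reproduce the classical derangement numbers 1, 0, 1, 2, 9, 44, 265.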


Like several other structures that we have been considering previously, permutations
allow for transparent connections between structural constraints and the forms
of generating functions. The major counting results encountered in this section are
summarized in Figure 7.

▷ II.12. Permutations such that $\sigma^f = \mathrm{Id}$. Such permutations are “roots of unity” in the
symmetric group. Their EGF is

$\exp\left(\sum_{d \mid f} \frac{z^d}{d}\right),$

where the sum extends to all divisors d of f. ◁


▷ II.13. Parity constraints in permutations. The EGFs of permutations having only even size
cycles (E(z)) or odd size cycles (O(z)) are


$E(z) = \exp\left(\frac{1}{2}\log\frac{1}{1-z^2}\right) = \frac{1}{\sqrt{1-z^2}}, \qquad O(z) = \exp\left(\frac{1}{2}\log\frac{1+z}{1-z}\right) = \sqrt{\frac{1+z}{1-z}}.$


From the EGFs, one finds $E_{2n} = (1 \cdot 3 \cdot 5 \cdots (2n-1))^2$, $O_{2n} = E_{2n}$, $O_{2n+1} = (2n+1)\,E_{2n}$.

The EGFs of permutations having an even number of cycles ($E^*(z)$) and an odd number
of cycles ($O^*(z)$) are

$E^*(z) = \cosh\left(\log\frac{1}{1-z}\right) = \frac{1}{2}\left(\frac{1}{1-z} + 1 - z\right), \qquad O^*(z) = \sinh\left(\log\frac{1}{1-z}\right) = \frac{1}{2}\left(\frac{1}{1-z} - 1 + z\right),$

so that parity of the number of cycles is evenly distributed amongst permutations of size n
as soon as n ≥ 2. The generating functions obtained in this way are analogous to the ones
appearing in the discussion of “Comtet’s square” in the previous section. ◁


▷ II.14. A hundred prisoners I. This puzzle originates with a paper of Gál and Miltersen [224,
499]. A hundred prisoners, each uniquely identified by a number between 1 and 100, have
been sentenced to death. The director of the prison gives them a last chance. He has a cabinet
with 100 drawers (numbered 1 to 100). In each, he’ll place at random a card with a prisoner’s
number (all numbers different). Prisoners will be allowed to enter the room one after the other
and open, then close again, 50 drawers of their own choosing, but will not in any way be allowed
to communicate with one another. The goal of each prisoner is to locate the drawer that contains
his own number. If all prisoners succeed, then they will all be spared; if at least one fails, they
will all be executed.

There are two mathematicians amongst the prisoners. The first one, a pessimist, declares
that their overall chances of success are only of the order of $1/2^{100} \doteq 8 \cdot 10^{-31}$. The second
one, a combinatorialist, claims he has a strategy for the prisoners, which has a more than 30%
chance of success. Who is right? [Note III.9, p. 165 provides a solution, but our gentle reader
is advised to reflect on the problem for a few moments, before she jumps there.] ◁


II. 4.2. Second level structures. Consider the three basic constructors of la- 
belled sequence (SEQ), set (SET), and cycle (CYC). We can play the formal game 
of examining what the various combinations produce as combinatorial objects. Re- 
stricting attention to superpositions of two constructors (an external one applied to an 
internal one) gives nine possibilities summarized by the following table: 


ext. \ int.   SEQ                                 SET                            CYC

SEQ           Labelled compositions (L)           Surjections (R)                Alignments (O)
              SEQ ∘ SEQ                           SEQ ∘ SET                      SEQ ∘ CYC
              $\frac{1-z}{1-2z}$                  $\frac{1}{2-e^z}$              $\frac{1}{1-\log(1-z)^{-1}}$

SET           Fragmented permutations (F)         Set partitions (S)             Permutations (P)
              SET ∘ SEQ                           SET ∘ SET                      SET ∘ CYC
              $e^{z/(1-z)}$                       $e^{e^z-1}$                    $\frac{1}{1-z}$

CYC           Supernecklaces ($S^{I}$)            Supernecklaces ($S^{II}$)      Supernecklaces ($S^{III}$)
              CYC ∘ SEQ                           CYC ∘ SET                      CYC ∘ CYC
              $\log\frac{1-z}{1-2z}$              $\log(2-e^z)^{-1}$             $\log\frac{1}{1-\log(1-z)^{-1}}$

The classes of surjections, alignments, set partitions, and permutations appear
naturally as SEQ ∘ SET, SEQ ∘ CYC, SET ∘ SET, and SET ∘ CYC (top right corner).
The other ones represent essentially nonclassical objects. The case of $\mathcal{L}$ corresponding
to SEQ ∘ SEQ describes objects that are (ordered) sequences of linear graphs; this can
be interpreted as permutations with separators inserted, e.g., 53|264|1, or alternatively
as integer compositions with a labelling superimposed, so that $L_n = n!\,2^{n-1}$. The
class $\mathcal{F} = \mathrm{SET}\{\mathrm{SEQ}_{\ge 1}\{\mathcal{Z}\}\}$ corresponds to unordered collections of permutations;
in other words, “fragments” are obtained by breaking a permutation into pieces (pieces
must be nonempty for definiteness). The interesting EGF is


$F(z) = e^{z/(1-z)} = 1 + z + 3\frac{z^2}{2!} + 13\frac{z^3}{3!} + 73\frac{z^4}{4!} + 501\frac{z^5}{5!} + \cdots$

(EIS A000262: “sets of lists”). The corresponding asymptotic analysis serves to
illustrate an important aspect of the saddle point method in Chapter VIII. What we termed
“supernecklaces” in the last row represents cyclic arrangements of composite objects
existing in three brands.
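
The first coefficients of $F(z) = e^{z/(1-z)}$ can be recovered in two independent ways, which gives a quick consistency check. In the sketch below, the recurrence $F_{n+1} = (2n+1)F_n - n(n-1)F_{n-1}$ is derived for the occasion from $(1-z)^2 F'(z) = F(z)$, and the direct count uses the fact that n labels can be arranged into an unordered set of k nonempty lists in $\frac{n!}{k!}\binom{n-1}{k-1}$ ways; neither formula is taken from the text, and the function names are illustrative.

```python
from math import comb, factorial

def fragmented_by_recurrence(n_max):
    """F_0..F_{n_max} from F_{n+1} = (2n+1) F_n - n(n-1) F_{n-1}."""
    counts = [1, 1]
    for n in range(1, n_max):
        counts.append((2 * n + 1) * counts[n] - n * (n - 1) * counts[n - 1])
    return counts[: n_max + 1]

def fragmented_direct(n):
    """Sum over k of the number of sets of k nonempty lists on n labels."""
    if n == 0:
        return 1
    return sum(factorial(n) // factorial(k) * comb(n - 1, k - 1)
               for k in range(1, n + 1))

def labelled_compositions(n):
    """L_n = n! 2^(n-1) for n >= 1, from the EGF (1-z)/(1-2z)."""
    return 1 if n == 0 else factorial(n) * 2 ** (n - 1)

print(fragmented_by_recurrence(5))
print([fragmented_direct(n) for n in range(6)])
```

The same pattern (differentiate the EGF, read off a linear recurrence) applies to several other entries of the table of second level structures.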
All sorts of refinements, of which Figure 7 may give an idea, are clearly possible.
We leave to the reader’s imagination the task of determining which amongst the level 3 
structures may be of combinatorial interest. . . 


▷ II.15. A meta-exercise: Counting specifications of level n. The algebra of constructions
satisfies the combinatorial isomorphism $\mathrm{SET}\{\mathrm{CYC}\{\mathcal{X}\}\} \cong \mathrm{SEQ}\{\mathcal{X}\}$ for all $\mathcal{X}$. How many
different terms involving n constructions can be built from three symbols CYC, SET, SEQ
satisfying a semi-group law (“∘”) together with the relation SET ∘ CYC = SEQ? This determines
the number of specifications of level n. [Hint: the OGF is rational as normal forms correspond
to words with an excluded pattern.] ◁


II.5. Labelled trees, mappings, and graphs 


In this section, we consider labelled trees as well as other important structures that 
are naturally associated with them, namely mappings and functional graphs on one 
side, graphs of small excess on the other side. Like in the unlabelled case considered 
in Section I. 6, the corresponding combinatorial classes are inherently recursive, the 
case of trees being typical since a tree is obtained by appending a root to a collection 
(set, sequence) of subtrees. From there, it is possible to build the graphs associated 
to mappings from a finite set to itself, as these decompose as sets of connected
components that are cycles of trees. Variations of these constructions finally open access
to the enumeration of graphs having a fixed excess of the number of edges over the
number of vertices.


II. 5.1. Trees. The trees to be studied here are invariably labelled, so that nodes 
bear distinct integer labels. Unless otherwise specified, they are rooted, meaning as 
usual that one node is distinguished as the root. Labelled trees, like their unlabelled 
counterparts, exist in two varieties: (i) plane trees where an embedding in the plane
is understood (or, equivalently, subtrees dangling from a node are ordered, say, from
left to right); (ii) nonplane trees where no such embedding is imposed (such trees are
then nothing but connected undirected acyclic graphs with a distinguished root). Trees
may be further restricted by the additional constraint that the node outdegrees should
belong to a fixed set $\Omega \subseteq \mathbb{Z}_{\ge 0}$ where $\Omega \ni 0$.


[Diagram: an unlabelled plane tree & the label sequence (3, 2, 5, 1, 7, 4, 6).]


FIGURE II.8. A labelled plane tree is determined by an unlabelled tree (the “shape”)
and a permutation of the labels 1, ..., n.

FIGURE II.9. There are $T_1 = 1$, $T_2 = 2$, $T_3 = 9$, and in general $T_n = n^{n-1}$ Cayley
trees of size n.


Plane labelled trees. We first dispose of the plane variety of labelled trees. Let
$\mathcal{A}$ be the set of (rooted labelled) plane trees constrained by Ω. This family is specified
by

$\mathcal{A} = \mathcal{Z} \star \mathrm{SEQ}_{\Omega}\{\mathcal{A}\},$


where $\mathcal{Z}$ represents the atomic class consisting of a single labelled node: $\mathcal{Z} = \{1\}$.
The sequence construction appearing here reflects the planar embedding of trees, as
subtrees stemming from a common root are ordered between themselves. Accordingly,
the EGF A(z) satisfies


$A(z) = z\,\phi(A(z))$ where $\phi(u) = \sum_{\omega \in \Omega} u^{\omega}.$

This is exactly the same equation as the one satisfied by the ordinary GF of
Ω-restricted unlabelled plane trees (see Proposition I.5). Thus, $\frac{1}{n!}A_n$ is the number
of unlabelled trees. In other words: in the plane rooted case, the number of labelled
trees equals n! times the corresponding number of unlabelled trees. As illustrated by
Figure 8, this is easily understood combinatorially: each labelled tree can be defined
by its “shape” that is an unlabelled tree and by the sequence of node labels where
nodes are traversed in some fixed order (preorder, say). Finally, one has, by Lagrange
inversion,


$A_n = n!\,[z^n]A(z) = (n-1)!\,[u^{n-1}]\,\phi(u)^n.$


This simple analytic-combinatorial relation enables us to transpose all of the enumer- 
ative results of Section I.5.1 to plane labelled trees (upon multiplying the evaluations 
by n!, of course). In particular, the total number of “general” plane labelled trees (with 
no degree restriction imposed, i.e., 2 = Z50) is 


1 /2n-—2 (2n — 2)! 

Ix = SO Bo (2 8)): 
mx (07?) (n —1)! ( ee) 

The corresponding sequence starts as 1, 2, 12, 120, 1680 and is EJS A001813. 


Nonplane labelled trees. We next turn to labelled nonplane trees (Figure 9) to
which the rest of this section will be devoted. The class $\mathcal{T}$ of all such trees is definable
by a symbolic equation, which provides an implicit equation satisfied by the EGF:

(41)    $\mathcal{T} = \mathcal{Z} \star \operatorname{SET}\{\mathcal{T}\} \quad\Longrightarrow\quad T(z) = z\,e^{T(z)}.$

There the set construction translates the fact that subtrees stemming from the root are
not ordered between themselves. From the specification (41), the EGF $T(z)$ is defined
implicitly by the “functional equation”

(42)    $T(z) = z\,e^{T(z)}.$

The first few values are easily found, for instance by the method of indeterminate
coefficients,

    $T(z) = z + 2\frac{z^2}{2!} + 9\frac{z^3}{3!} + 64\frac{z^4}{4!} + 625\frac{z^5}{5!} + \cdots.$

As suggested by the first few coefficients ($9 = 3^2$, $64 = 4^3$, $625 = 5^4$), the general
formula is

(43)    $T_n = n^{n-1},$

which is established (as in the case of plane unlabelled trees, Chapter I) by the
Lagrange Inversion Theorem (see APPENDIX A: Lagrange Inversion, p. 677).
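
The “method of indeterminate coefficients” mentioned above can be mimicked mechanically. The following is a minimal sketch, not the book's own procedure: it iterates the functional equation $T = z e^{T}$ on truncated power series with exact rational arithmetic, each pass fixing at least one more coefficient. The function name `tree_function_coeffs` is a hypothetical label chosen for this illustration.

```python
from fractions import Fraction
from math import factorial

def tree_function_coeffs(n_max):
    """Return [T_0, ..., T_{n_max}], where T(z) = sum T_n z^n / n! solves T = z * exp(T)."""
    # t[k] holds the ordinary coefficient [z^k] T(z) as an exact fraction.
    t = [Fraction(0)] * (n_max + 1)
    for _ in range(n_max):  # each pass makes one more coefficient correct
        # e = exp(T) truncated at degree n_max, from e' = T' * e:
        #   k * e_k = sum_{j=1..k} j * t_j * e_{k-j}
        e = [Fraction(1)] + [Fraction(0)] * n_max
        for k in range(1, n_max + 1):
            e[k] = sum(j * t[j] * e[k - j] for j in range(1, k + 1)) / k
        # T = z * exp(T): shift the coefficients of exp(T) by one position.
        t = [Fraction(0)] + e[:n_max]
    counts = []
    for k in range(n_max + 1):
        value = t[k] * factorial(k)   # T_k = k! [z^k] T(z)
        assert value.denominator == 1
        counts.append(value.numerator)
    return counts
```

Comparing the returned list with $n^{n-1}$ for small $n$ gives a quick sanity check of formula (43).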

The enumerative result $T_n = n^{n-1}$ is a famous one, attributed to the prolific
British mathematician Arthur Cayley (1821-1895) who had keen interest in combinatorial
mathematics and published altogether over 900 papers and notes. Consequently,
formula (43) given by Cayley in 1889 is often referred to as “Cayley’s formula” and
unrestricted nonplane labelled trees are often called “Cayley trees”. See [54, p. 51] for
a historical discussion. The function $T(z)$ is also known as the (Cayley) “tree function”;
it is a close relative of the $W$-function [100] defined implicitly by $W e^{W} = z$,
which was introduced by the Swiss mathematician Johann Lambert (1728-1777), otherwise
famous for first proving the irrationality of the number $\pi$.

A similar process gives the number of (nonplane rooted) trees where all (out)degrees
of nodes are restricted to lie in a set $\Omega$. This corresponds to the specification:

    $\mathcal{T}^{(\Omega)} = \mathcal{Z} \star \operatorname{SET}_{\Omega}\{\mathcal{T}^{(\Omega)}\} \quad\Longrightarrow\quad T^{(\Omega)}(z) = z\,\bar\phi(T^{(\Omega)}(z))$ where $\bar\phi(u) = \sum_{\omega\in\Omega} \frac{u^{\omega}}{\omega!}.$

What the last formula involves is the “exponential characteristic” of the degree sequence
(as opposed to the ordinary characteristic, in the planar case). It is once more
amenable to Lagrange inversion. In summary:

Proposition II.5. The number of rooted nonplane trees, where all nodes have their
outdegree in $\Omega$, is

    $T_n^{(\Omega)} = (n-1)!\,[u^{n-1}]\,\bigl(\bar\phi(u)\bigr)^n$ where $\bar\phi(u) = \sum_{\omega\in\Omega} \frac{u^{\omega}}{\omega!}.$

In particular, when all node degrees are allowed ($\Omega = \mathbb{Z}_{\ge 0}$), the number of trees is
$T_n = n^{n-1}$ and its EGF is the Cayley tree function satisfying $T(z) = z\,e^{T(z)}$.

▷ II.16. Prüfer’s bijective proofs of Cayley’s formula. The simplicity of Cayley’s formula calls
for a combinatorial explanation. The most famous one is due to Prüfer (in 1918). It establishes
as follows a bijective correspondence between unrooted Cayley trees, whose number is $n^{n-2}$ for
size $n$, and sequences $(a_1, \ldots, a_{n-2})$ with $1 \le a_j \le n$ for each $j$. Given an unrooted tree $\tau$,
remove the endnode (and its incident edge) with the smallest label; let $a_1$ denote the label of
the node that was joined to the removed node. Continue with the pruned tree $\tau'$ to get $a_2$ in a
similar way. Repeat the construction of the sequence until the tree obtained only consists of a
single edge. For instance:

    [Illustration: an unrooted tree on the labels 1,...,8] $\longrightarrow$ (4, 8, 4, 8, 8, 4).

It can be checked that the correspondence is bijective; see [54, p. 53] or [364, p. 5]. ◁
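
The pruning procedure of this note is easy to simulate. The sketch below is an illustration under stated assumptions rather than a transcription of Prüfer's paper: trees are given as edge lists over the labels 1..n, and the names `prufer_encode`, `prufer_decode` and `all_codes` are hypothetical. The decoder is the standard inverse (repeatedly attach the smallest unused leaf), which is what makes the correspondence checkable by machine for small sizes.

```python
from itertools import product

def prufer_encode(n, edges):
    """Prüfer sequence of an unrooted tree on labels 1..n given by its edge list."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seq = []
    for _ in range(n - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)  # endnode with smallest label
        (neighbour,) = adj[leaf]
        seq.append(neighbour)
        adj[neighbour].discard(leaf)
        del adj[leaf]
    return seq

def prufer_decode(seq):
    """Edge list of the tree on labels 1..len(seq)+2 whose Prüfer sequence is seq."""
    n = len(seq) + 2
    degree = {v: 1 for v in range(1, n + 1)}
    for a in seq:
        degree[a] += 1
    edges = []
    for a in seq:
        leaf = min(v for v in degree if degree[v] == 1)
        edges.append((leaf, a))
        degree[leaf] -= 1  # drops to 0: the leaf is now removed
        degree[a] -= 1
    u, v = [x for x in degree if degree[x] == 1]  # the final single edge
    edges.append((u, v))
    return edges

def all_codes(n):
    """All sequences (a_1, ..., a_{n-2}) with entries in 1..n."""
    return [list(s) for s in product(range(1, n + 1), repeat=n - 2)]
```

For example, decoding (4, 8, 4, 8, 8, 4) yields a tree on 8 nodes in which nodes 4 and 8 both have degree 4, and encoding that tree returns the same sequence.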


▷ II.17. Forests. The number of unordered $k$-forests (i.e., $k$-sets of trees) is

    $F_n^{(k)} = n!\,[z^n]\frac{T(z)^k}{k!} = \frac{(n-1)!}{(k-1)!}\,\frac{n^{n-k}}{(n-k)!} = \binom{n-1}{k-1}\,n^{n-k},$

as follows from Bürmann’s form of Lagrange inversion. ◁

▷ II.18. Labelled hierarchies. The class $\mathcal{L}$ of labelled hierarchies is formed of trees whose
internal nodes are unlabelled and are constrained to have outdegree larger than 1, while leaves
have labels attached to them. As for other labelled structures, size is the number of labels (so
that internal nodes do not contribute). Hierarchies satisfy the specification

    $\mathcal{L} = \mathcal{Z} + \operatorname{SET}_{\ge 2}\{\mathcal{L}\} \quad\Longrightarrow\quad L = z + e^{L} - 1 - L.$

This happens to be solvable in terms of the Cayley function: $L(z) = T\bigl(\tfrac12 e^{z/2 - 1/2}\bigr) + \tfrac{z}{2} - \tfrac12$.
The first few values are 0, 1, 1, 4, 26, 236 (EIS A000311): these numbers count phylogenetic trees
(used to describe the evolution of a genetically related group of organisms) and correspond to
Schröder’s “fourth problem”; see [98, p. 224] and Note I.42, p. 68, for unlabelled analogues.

The class of binary (labelled) hierarchies defined by the additional fact that internal nodes
can have degree 2 only is expressed by

    $\mathcal{M} = \mathcal{Z} + \operatorname{SET}_{2}\{\mathcal{M}\} \quad\Longrightarrow\quad M(z) = 1 - \sqrt{1-2z}$ and $M_n = 1\cdot 3\cdots(2n-3),$

where the counting numbers are now the odd factorials. ◁


II. 5.2. Mappings and functional graphs. Let $\mathcal{F}$ be the class of mappings (or
“functions”) from $[1..n]$ to itself. A mapping $f \in [1..n] \to [1..n]$ can be represented
by a directed graph over the set of vertices $[1..n]$ with an edge connecting $x$
to $f(x)$, for all $x \in [1..n]$. The graphs so obtained are called functional graphs and
they have the characteristic property that the outdegree of each vertex is exactly equal
to 1.


Mappings and associated graphs. Given a mapping (or function) $f$, upon starting
from any point $x_0$, the succession of (directed) edges in the graph traverses the
vertices corresponding to iterated values of the mapping,

    $x_0,\ f(x_0),\ f(f(x_0)),\ \ldots.$


Since the domain is finite, each such sequence must eventually loop on itself. When 
the operation is repeated starting each time from an element not previously hit, the 
vertices group themselves into components. This leads to another characterization 
of functional graphs (Figure 10): A functional graph is a set of connected functional 
graphs. A connected functional graph is a collection of rooted trees arranged in a
cycle.

Thus, with $\mathcal{T}$ being as before the class of all Cayley trees, and with $\mathcal{K}$ the class of
all connected functional graphs, we have the specification:

            $\mathcal{F} = \operatorname{SET}\{\mathcal{K}\}$                    $F(z) = e^{K(z)}$
(44)        $\mathcal{K} = \operatorname{CYC}\{\mathcal{T}\}$     $\Longrightarrow$     $K(z) = \log\dfrac{1}{1-T(z)}$
            $\mathcal{T} = \mathcal{Z} \star \operatorname{SET}\{\mathcal{T}\}$              $T(z) = z\,e^{T(z)}.$

What is especially interesting here is a specification binding three types of related
structures. From Equation (44), the EGF $F(z)$ is found to satisfy $F = (1-T)^{-1}$. It
can be checked from there, by Lagrange inversion once again, that we have

    $F_n = n^n,$

as was to be expected (!) from the origin of the problem. More interestingly, Lagrange
inversion also provides for the number of connected functional graphs (expand
$\log(1-T)^{-1}$ and recover coefficients by Bürmann’s form):

(45)    $K_n = n^{n-1}\,Q(n)$ where $Q(n) := 1 + \frac{n-1}{n} + \frac{(n-1)(n-2)}{n^2} + \cdots.$

The quantity Q(n) that appears in (45) is a famous one that surfaces in many prob- 
lems of discrete mathematics (including the birthday paradox, Equation (27)). Knuth 
has proposed to call it “Ramanujan’s Q-function” as it already appears in the first let- 
ter of Ramanujan to Hardy in 1913. The asymptotic analysis can be done elementarily 
by developing a continuous approximation of the general term and approximating the 
resulting Riemann sum by an integral: this is an instance of the Laplace method for 
sums briefly explained in APPENDIX B: Laplace’s method, p. 700. (See also [306, 
Sec. 1.2.11.3] and [434, Sec. 4.7].) In fact, very precise estimates come out naturally
from an analysis of the singularities of the EGF $K(z)$, as we shall see in Chapters VI
and VII. The net result is

    $K_n \sim n^n \sqrt{\frac{\pi}{2n}},$

so that a fraction about $1/\sqrt{n}$ of all the graphs consist of a single component.

FIGURE II.10. A functional graph of size $n = 26$ associated to the mapping $\varphi$ such
that $\varphi(1) = 16$, $\varphi(2) = \varphi(3) = 11$, $\varphi(4) = 23$, and so on.


Constrained mappings. As is customary with the symbolic method, the constructions
(44) also lead to a large number of related counting results. First, the mappings
without fixed points ($(\forall x)\ f(x) \neq x$) and those without 1- or 2-cycles (additionally,
$(\forall x)\ f(f(x)) \neq x$) have EGFs

    $\frac{e^{-T(z)}}{1-T(z)}, \qquad \frac{e^{-T(z)-T^2(z)/2}}{1-T(z)}.$

The first equation is consistent with what a direct count yields, namely $(n-1)^n$,
which is asymptotic to $e^{-1}n^n$, so that the fraction of mappings without fixed point is
asymptotic to $e^{-1}$. The second one lends itself easily to complex-asymptotic methods
that give

    $n!\,[z^n]\frac{e^{-T-T^2/2}}{1-T} \sim e^{-3/2}\,n^n,$

and the proportion is asymptotic to $e^{-3/2}$. These two particular estimates are of the
same form as what has been found for permutations (the generalized derangements,
Eq. (40)). Such facts, which are not quite obvious by elementary probabilistic arguments,
are in fact neatly explained by the singular theory of combinatorial schemas developed
in Part B of this book.
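
The counts obtained so far for mappings ($F_n = n^n$, formula (45) for connected functional graphs, and $(n-1)^n$ for mappings without fixed points) can be confirmed by exhaustive enumeration for small $n$. The following is a minimal sketch under our own conventions: points are numbered from 0, a mapping is a tuple `f` with `f[x]` the image of `x`, and the names `ramanujan_q`, `is_connected` and `census` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def ramanujan_q(n):
    """Q(n) = 1 + (n-1)/n + (n-1)(n-2)/n^2 + ... as an exact fraction."""
    total, term = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        total += term
        term *= Fraction(n - k, n)
    return total

def is_connected(f):
    """True if the functional graph of f has a single (weakly) connected component."""
    n = len(f)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x in range(n):
        parent[find(x)] = find(f[x])  # merge x with its image
    return len({find(x) for x in range(n)}) == 1

def census(n):
    """Brute-force counts over all mappings of size n: (all, connected, fixed-point free)."""
    maps = list(product(range(n), repeat=n))
    connected = sum(is_connected(f) for f in maps)
    no_fixed_point = sum(all(f[x] != x for x in range(n)) for f in maps)
    return len(maps), connected, no_fixed_point
```

The enumeration is exponential in $n$, so it is only meant as a check for sizes up to 5 or 6.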

Next, idempotent mappings satisfying $f(f(x)) = f(x)$ for all $x$ correspond to
$\mathcal{I} = \operatorname{SET}\{\mathcal{Z} \star \operatorname{SET}\{\mathcal{Z}\}\}$, so that

    $I(z) = e^{z e^{z}}$ and $I_n = \sum_{k=0}^{n} \binom{n}{k}\,k^{n-k}.$

(The specification translates the fact that idempotent mappings can have only cycles
of length 1 on which are grafted sets of direct antecedents.) The latter sequence
is EIS A000248, which starts as 1, 1, 3, 10, 41, 196, 1057. An asymptotic estimate can
be derived either from the Laplace method or, better, from the saddle point method
exposed in Chapter VIII.
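
As a small check of the formula for idempotent mappings, the closed form can be compared with a direct enumeration. This is a sketch only; the function names are hypothetical and points are numbered from 0.

```python
from itertools import product
from math import comb

def idempotent_formula(n):
    """I_n = sum_k C(n, k) * k^(n-k): choose the k fixed points, map the rest onto them."""
    return sum(comb(n, k) * k ** (n - k) for k in range(n + 1))

def idempotent_brute_force(n):
    """Count the mappings f of {0..n-1} into itself with f(f(x)) = f(x) for all x."""
    return sum(
        all(f[f[x]] == f[x] for x in range(n))
        for f in product(range(n), repeat=n)
    )
```

The term for a given $k$ reads off the specification: the $k$ fixed points are the cycles of length 1, and each of the remaining $n-k$ points is attached directly to one of them.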

Several analyses of this type are of relevance to cryptography and the study of
random number generators. For instance, the fact that a random mapping over $[1..n]$
tends to reach a cycle in $O(\sqrt{n})$ steps led Pollard to design a Monte Carlo integer
factorization algorithm; see [307, p. 371] and [434, Sec 8.8]. The algorithm, once
suitably optimized, first led to the factorization of the Fermat number $F_8 = 2^{2^8} + 1$,
obtained by Brent in 1980.
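▷ II.19. Binary mappings. The class $\mathcal{BF}$ of binary mappings, where each point has either 0
or 2 preimages, is specified by

    $\mathcal{BF} = \operatorname{SET}\{\mathcal{K}\}, \quad \mathcal{K} = \operatorname{CYC}\{\mathcal{P}\}, \quad \mathcal{P} = \mathcal{Z} \star \mathcal{B}, \quad \mathcal{B} = \mathcal{Z} \star \operatorname{SET}_{0,2}\{\mathcal{B}\}$

(planted trees $\mathcal{P}$ and binary trees $\mathcal{B}$ are needed), so that

    $BF(z) = \frac{1}{\sqrt{1-2z^2}}, \qquad BF_{2n} = \frac{((2n)!)^2}{2^n\,(n!)^2}.$

The class $\mathcal{BF}$ is an approximate model of the behaviour of (modular) quadratic functions under
iteration. See [14, 198] for a general enumerative theory of random mappings including degree-restricted
ones. ◁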
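
To make the link between random mappings and factorization concrete, here is a minimal sketch of the textbook form of Pollard's method: iterate a modular quadratic function and detect the cycle with Floyd's two-speed traversal. It is an illustration under stated assumptions (the polynomial $x^2 + c$, the starting point 2, and the name `pollard_rho` are conventional choices of ours), not the optimized variant used for the Fermat number computation.

```python
from math import gcd

def pollard_rho(n, c=1, x0=2):
    """Try to find a nontrivial factor of the composite n by iterating x -> x^2 + c mod n.

    Returns a proper divisor of n, or None if this choice of (c, x0) fails,
    in which case one would retry with another constant c.
    """
    def f(x):
        return (x * x + c) % n

    tortoise, hare, d = x0, x0, 1
    while d == 1:
        tortoise = f(tortoise)      # one step
        hare = f(f(hare))           # two steps
        d = gcd(abs(tortoise - hare), n)
    return d if d != n else None
```

Heuristically, modulo an unknown prime factor $p$ the iteration behaves like a random mapping on about $p$ points, so a collision modulo $p$ is expected after $O(\sqrt{p})$ steps; the gcd reveals it. Usage: `pollard_rho(8051)` returns a proper divisor of 8051 = 83 · 97.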
    Class of mappings        EGF
    All mappings             $\dfrac{1}{1-T}$
    Partial                  $\dfrac{e^{T}}{1-T}$
    Injective partial        $\dfrac{1}{1-z}\,e^{z/(1-z)}$
    Surjection               $\dfrac{1}{2-e^{z}}$
    Bijection                $\dfrac{1}{1-z}$
    Connected ($\mathcal{K}$)            $\log\dfrac{1}{1-T}$
    No fixed point           $\dfrac{e^{-T}}{1-T}$
    Involution               $e^{z+z^2/2}$
    Idempotent               $e^{z e^{z}}$

FIGURE II.11. A summary of various counting EGFs relative to mappings.


▷ II.20. Partial mappings. A partial mapping may be undefined at some points, where it can
be considered as taking a special value, $\bot$. The iterated preimages of $\bot$ form a forest, while
the remaining values organize themselves into a standard mapping. The class $\mathcal{PF}$ of partial
mappings is thus specified by $\mathcal{PF} = \operatorname{SET}\{\mathcal{T}\} \star \mathcal{F}$, so that

    $PF(z) = \frac{e^{T(z)}}{1-T(z)}$ and $PF_n = (n+1)^n.$

This construction lends itself to all sorts of variations. For instance, the class $\mathcal{PFI}$ of injective
partial maps is described as sets of chains of linear and circular graphs,
$\mathcal{PFI} = \operatorname{SET}\{\operatorname{CYC}\{\mathcal{Z}\} + \operatorname{SEQ}_{\ge 1}\{\mathcal{Z}\}\}$, so that

    $PFI(z) = \frac{1}{1-z}\,e^{z/(1-z)}, \qquad PFI_n = \sum_{i=0}^{n} i!\,\binom{n}{i}^{2}.$

(This is a symbolic rewriting of part of the paper [62].) ◁

The symbolic method thus gives access to a wide variety of counting results relative
to maps satisfying diverse constraints. A summary is offered in Figure 11.


II. 5.3. Labelled graphs. Random graphs form a major chapter of the theory of 
random discrete structures [60, 283]. We examine here enumerative results concerning 
graphs of low “complexity”, that is, graphs which are very nearly trees. (Such graphs 
for instance play an essential rôle in the analysis of early stages of the evolution of a
random graph, when edges are successively added, as shown in [193, 282].) 


Unrooted trees and acyclic graphs. The simplest of all connected graphs are
certainly the ones that are acyclic. These are trees, but contrary to the case of Cayley
trees, no root is specified. Let $\mathcal{U}$ be the class of all unrooted trees. Since a rooted tree
(rooted trees are, as we know, counted by $T_n = n^{n-1}$) is an unrooted tree combined
with a choice of a distinguished node (there are $n$ possible such choices for trees of
size $n$), one has

    $T_n = n\,U_n$ implying $U_n = n^{n-2}.$

At generating function level, this combinatorial equality translates into

    $U(z) = \int_0^z T(w)\,\frac{dw}{w},$

which integrates to give (take $T$ as the independent variable)

    $U(z) = T(z) - \frac12\,T(z)^2.$

Since $U(z)$ is the EGF of acyclic connected graphs, the quantity

    $A(z) = e^{U(z)} = e^{T(z) - T(z)^2/2}$

is the EGF of all acyclic graphs. (Equivalently, these are unordered forests of unrooted
trees.) Methods developed in Chapters VI and VII imply the estimate
$A_n \sim e^{1/2}\,n^{n-2}$. Surprisingly, perhaps, there are barely more acyclic graphs than unrooted
trees; such phenomena are easily explained by singularity analysis.
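
The relation $A(z) = e^{U(z)}$ can be turned into a recurrence by isolating the tree that contains the node labelled $n$: $A_n = \sum_{k=1}^{n} \binom{n-1}{k-1} U_k A_{n-k}$. The sketch below (function names are ours) computes acyclic graph counts this way and compares them with a brute-force enumeration of all labelled graphs on a few vertices.

```python
from itertools import combinations
from math import comb

def unrooted_trees(n):
    """U_n = n^(n-2) for n >= 1 (with U_1 = U_2 = 1)."""
    return 1 if n <= 2 else n ** (n - 2)

def acyclic_graphs(n_max):
    """[A_0, ..., A_{n_max}] from A = exp(U), splitting off the tree containing label n."""
    a = [1]
    for n in range(1, n_max + 1):
        a.append(sum(comb(n - 1, k - 1) * unrooted_trees(k) * a[n - k]
                     for k in range(1, n + 1)))
    return a

def acyclic_brute_force(n):
    """Count labelled graphs on n vertices having no cycle, by testing every edge subset."""
    pairs = list(combinations(range(n), 2))
    count = 0
    for mask in range(1 << len(pairs)):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for i, (u, v) in enumerate(pairs):
            if (mask >> i) & 1:
                ru, rv = find(u), find(v)
                if ru == rv:        # this edge would close a cycle
                    acyclic = False
                    break
                parent[ru] = rv
        count += acyclic
    return count
```

The ratio $A_n / n^{n-2}$ computed from `acyclic_graphs` can be compared with the limiting constant $e^{1/2}$ stated above; convergence is slow, so small cases only give a rough impression.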


Unicyclic graphs. The excess of a graph is defined as the difference between the
number of edges and the number of vertices. For a connected graph, this quantity
must be at least $-1$, the minimal value $-1$ being precisely attained by unrooted trees.
The class $\mathcal{W}_k$ is the class of connected graphs of excess equal to $k$; in particular
$\mathcal{U} = \mathcal{W}_{-1}$. The successive classes $\mathcal{W}_{-1}, \mathcal{W}_0, \mathcal{W}_1, \ldots$ may be viewed as describing
connected graphs of increasing complexity.

The class $\mathcal{W}_0$ comprises all connected graphs with the number of edges equal
to the number of vertices. Equivalently, a graph in $\mathcal{W}_0$ is a connected graph with
exactly one cycle (a sort of “eye”), and for that reason, elements of $\mathcal{W}_0$ are sometimes
referred to as “unicyclic components” or “unicycles”. In a way, such a graph looks
very much like an undirected version of a connected functional graph. Precisely, a
graph of $\mathcal{W}_0$ consists of a cycle of length at least 3 (by definition, graphs have neither
loops nor multiple edges) that is undirected (the orientation present in the usual cycle
construction is killed by identifying cycles isomorphic up to reflection) and on which
are grafted trees (these are implicitly rooted by the point at which they are attached
to the cycle). With UCYC representing the (new) undirected cycle construction, one
thus has

    $\mathcal{W}_0 = \operatorname{UCYC}_{\ge 3}\{\mathcal{T}\}.$

We claim that this construction is reflected by the EGF equation

(46)    $W_0(z) = \frac12\log\frac{1}{1-T(z)} - \frac12\,T(z) - \frac14\,T(z)^2.$

Indeed one has the isomorphism

    $\mathcal{W}_0 + \mathcal{W}_0 \cong \operatorname{CYC}_{\ge 3}\{\mathcal{T}\},$

since we may regard the two disjoint copies on the left as instantiating two possible
orientations of the undirected cycle. The result of (46) then follows from the usual
translation of the cycle construction. It is originally due to the Hungarian probabilist
Rényi in 1959. Asymptotically, one finds (by methods of Chapter VI):

(47)    $n!\,[z^n]W_0 \sim \frac14\sqrt{2\pi}\,n^{n-1/2} - \frac76\,n^{n-1} + \frac{1}{48}\sqrt{2\pi}\,n^{n-3/2} + \cdots.$
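
Exact values are also within reach. Expanding (46) as $\frac12\sum_{k\ge 3} T^k/k$ and using Bürmann's form $n!\,[z^n]T^k = k\,\frac{(n-1)!}{(n-k)!}\,n^{n-k}$ gives a finite sum; this closed form is our own derivation from (46), offered as a sketch rather than a formula quoted from the text. The code compares it with a brute-force count of connected graphs having as many edges as vertices (function names are hypothetical).

```python
from itertools import combinations
from math import factorial

def unicyclic_formula(n):
    """n![z^n] W_0(z), obtained by Lagrange inversion from (46) (derivation assumed here)."""
    total = sum(factorial(n - 1) // factorial(n - k) * n ** (n - k)
                for k in range(3, n + 1))
    return total // 2  # each undirected cycle was counted with both orientations

def unicyclic_brute_force(n):
    """Count connected labelled graphs with n vertices and exactly n edges."""
    pairs = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(pairs, n):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        for u, v in edges:
            parent[find(u)] = find(v)
        count += len({find(x) for x in range(n)}) == 1
    return count
```

For $n = 3$ the only unicyclic graph is the triangle, which both functions report.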
Finally, the class of graphs made only of trees and unicyclic components has EGF

    $e^{W_{-1}(z) + W_0(z)} = \frac{e^{T/2 - 3T^2/4}}{\sqrt{1-T}},$

and asymptotically: $n!\,[z^n]e^{W_{-1}+W_0} \sim \Gamma(3/4)\,2^{-1/4}\,e^{-1/2}\,\pi^{-1/2}\,n^{n-1/4}$. Such graphs
stand just next to acyclic graphs in order of structural complexity. They are the undirected
counterparts of functional graphs encountered in the previous section.

FIGURE II.12. A summary of major enumeration results relative to labelled graphs of
small excess.


▷ II.21. 2-Regular graphs. This is based on Comtet’s account [98, Sec. 7.3]. A 2-regular graph
is an undirected graph in which each vertex has degree exactly 2. Connected 2-regular graphs
are thus undirected cycles of length $n \ge 3$, so that the EGF of all 2-regular graphs is

    $R(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1-z}}.$

Given $n$ straight lines in general position, a cloud is defined to be a set of $n$ intersection points
no three being collinear. Clouds and 2-regular graphs are equinumerous. [Hint: Use duality.]
The asymptotic analysis will serve as a leading example of the singularity analysis process in
Chapter VI (Examples VI.1, p. 363 and VI.2, p. 378).

The general enumeration of $r$-regular graphs becomes somewhat more difficult as soon
as $r > 2$. Algebraic aspects are discussed in [234, 244] while Bender and Canfield [31] have
determined the asymptotic formula (for $rn$ even),

    $R_n^{(r)} \sim \sqrt{2}\,e^{(1-r^2)/4}\,\Bigl(\frac{r^{r/2}}{e^{r/2}\,r!}\Bigr)^{n}\,n^{rn/2},$

for the number of $r$-regular graphs of size $n$. ◁


Graphs of fixed excess. The previous discussion suggests considering more generally
the enumeration of connected graphs according to excess. E. M. Wright made
important contributions in this area [507, 508, 509] that are revisited in the famous
“giant paper on the giant component” by Janson, Knuth, Łuczak, and Pittel [282].
Wright’s results are summarized by the following proposition.

Proposition II.6. The EGF $W_k(z)$ of connected graphs with excess (of edges over
vertices) equal to $k$ is, for $k \ge 1$, of the form

(48)    $W_k(z) = \frac{P_k(T)}{(1-T)^{3k}}, \qquad T \equiv T(z),$

where $P_k$ is a polynomial of degree $3k+2$. For any fixed $k$, as $n \to \infty$, one has

(49)    $W_{k,n} = n!\,[z^n]W_k(z) = \frac{P_k(1)\sqrt{2\pi}}{2^{3k/2}\,\Gamma(3k/2)}\,n^{n+(3k-1)/2}\,\bigl(1 + O(n^{-1/2})\bigr).$

The combinatorial part of the proof (see Note 22 below) is an interesting exercise
in graph surgery and symbolic methods. The analytic part of the statement follows
straightforwardly from singularity analysis. The polynomials $P_k(T)$ and the constants
$P_k(1)$ are determined by an explicit nonlinear recurrence; one finds for instance:

    $W_1 = \frac{1}{24}\,\frac{T^4(6-T)}{(1-T)^3}, \qquad W_2 = \frac{1}{48}\,\frac{T^4(2 + 28T - 23T^2 + 9T^3 - T^4)}{(1-T)^6}.$


▷ II.22. Wright’s surgery. The full proof of Proposition II.6 by symbolic methods requires
the notion of pointing in conjunction with multivariate generating function techniques of
Chapter III. It is convenient to define $w_k(z,y) := y^k W_k(zy)$, which is a bivariate generating
function with $y$ marking the number of edges. Pick up an edge in a connected graph of excess
$k+1$, then remove it. This results either in a connected graph of excess $k$ with two pointed
vertices (and no edge in between) or in two connected components of respective excess $h$ and
$k-h$, each with a pointed vertex. [Illustration: the two outcomes of removing an edge.]
This translates into the differential recurrence on the $w_k$ ($\partial_x := \frac{\partial}{\partial x}$),

    $2\,\partial_y w_{k+1} = \bigl(z^2\partial_z^2 w_k - 2y\,\partial_y w_k\bigr) + \sum_{h=-1}^{k+1} (z\partial_z w_h)\cdot(z\partial_z w_{k-h}),$

and similarly for $W_k(z) = w_k(z,1)$. From there, it can be verified by induction that each $W_k$
is a rational function of $T \equiv T(z)$. (See Wright’s original papers [507, 508, 509] or [282] for
details.) ◁
As explained in the giant paper [282], such results combined with complex analytic
techniques provide very detailed information on the aspect of a random graph
$\Gamma(n,m)$ with $n$ nodes and $m$ edges. In the sparse case where $m$ is of the order of $n$,
one finds the following properties to hold “with high probability” (w.h.p.)^7, that is,
with probability tending to 1 as $n \to \infty$.
• For $m = \mu n$, with $\mu < \frac12$, the random graph $\Gamma(n,m)$ has w.h.p. only tree
and unicycle components; the largest component is w.h.p. of size $O(\log n)$.
• For $m = \frac12 n + O(n^{2/3})$, w.h.p. there appear one or several semi-giant
components that have size $O(n^{2/3})$.
• For $m = \mu n$, with $\mu > \frac12$, there is w.h.p. a unique giant component of size
proportional to $n$.


In each case, refined estimates follow from a detailed analysis of corresponding gen- 
erating functions, which is a main theme of [193] and especially [282]. Raw forms 
of these results were first obtained by Erdős and Rényi who launched the subject in a
famous series of papers dating from 1959-60; see the books [60, 283] for a probabilis- 
tic context and the paper [32] for the finest counting estimates available. In contrast, 
the enumeration of all connected graphs (irrespective of the number of edges, that is, 
without excess being taken into account) is a relatively easy problem treated in the 
next section. Many other classical aspects of the enumerative theory of graphs are 
covered in the book Graphical Enumeration by Harary and Palmer [259]. 


^7 Synonymous expressions are “asymptotically almost surely” (a.a.s.) and “in probability”. The term
“almost surely” is sometimes used, though it lends itself to confusion with properties of continuous
measures.

II. 6. Additional constructions


Like in the unlabelled case, pointing and substitution are available in the world 
of labelled structures (Section II. 6.1), and implicit definitions enlarge the scope of 
the symbolic method (Section II. 6.2). The inversion process needed to enumerate 
implicit structures is even simpler, since in the labelled universe sets and cycles have 
more concise translations as operators over EGF. Finally, and this departs significantly 
from Chapter I, the fact that integer labels are naturally ordered makes it possible to 
take into account certain order properties of combinatorial structures (Section II. 6.3).


II. 6.1. Pointing and substitution. The pointing operation is entirely similar to 
its unlabelled counterpart since it consists in distinguishing one atom amongst all the 
ones that compose an object of size n. The definition of composition for labelled 
structures is however a bit more subtle as it requires singling out “leaders” in sub- 
stituends. 


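Pointing. The pointing of a class $\mathcal{B}$ is defined by

    $\mathcal{A} = \Theta\mathcal{B}$ iff $\mathcal{A}_n = [1..n] \times \mathcal{B}_n.$

In other words, in order to generate an element of $\mathcal{A}$, select one of the $n$ labels and
point at it. Clearly

    $A_n = n \cdot B_n \quad\Longrightarrow\quad A(z) = z\,\frac{d}{dz}B(z).$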
Substitution (composition). The composition or substitution can be defined so
that it corresponds a priori to composition of generating functions. It is formally
defined as

    $\mathcal{B} \circ \mathcal{C} = \sum_{k \ge 0} \mathcal{B}_k \times \operatorname{SET}_k\{\mathcal{C}\},$

so that its EGF is

    $\sum_{k \ge 0} B_k\,\frac{(C(z))^k}{k!} = B(C(z)).$

A combinatorial way of realizing this definition and forming an arbitrary object of $\mathcal{B} \circ \mathcal{C}$
is as follows. First select an element $\beta \in \mathcal{B}$ called the “base” and let $k = |\beta|$ be
its size; then pick up a $k$-set of elements of $\mathcal{C}$; the elements of the $k$-set are naturally ordered
by value of their “leader” (the leader of an object being by convention the value of
its smallest label); the element with leader of rank $r$ is then substituted to the node
labelled by value $r$ of $\beta$.


Theorem II.3. The combinatorial constructions of pointing and substitution are admissible:

    $\mathcal{A} = \Theta\mathcal{B} \quad\Longrightarrow\quad A(z) = z\,\partial_z B(z), \qquad \partial_z \equiv \frac{d}{dz},$
    $\mathcal{A} = \mathcal{B} \circ \mathcal{C} \quad\Longrightarrow\quad A(z) = B(C(z)).$

For instance, the EGF of (relabelled) pairings of elements drawn from $\mathcal{C}$ is

    $e^{C(z) + C(z)^2/2},$

since the EGF of involutions is $e^{z + z^2/2}$.

▷ II.23. Standard constructions based on substitutions. The sequence class of $\mathcal{A}$ may be defined
by composition as $\mathcal{P} \circ \mathcal{A}$ where $\mathcal{P}$ is the set of all permutations. The set class of $\mathcal{A}$ may be
defined as $\mathcal{U} \circ \mathcal{A}$ where $\mathcal{U}$ is the class of all urns. Similarly, cycles are obtained by substitution
into circular graphs. Thus,

    $\operatorname{SEQ}(\mathcal{A}) = \mathcal{P} \circ \mathcal{A}, \qquad \operatorname{SET}(\mathcal{A}) = \mathcal{U} \circ \mathcal{A}, \qquad \operatorname{CYC}(\mathcal{A}) = \mathcal{C} \circ \mathcal{A}.$

In this way, permutations, urns and circular graphs appear as archetypal classes in a development
of combinatorial analysis based on composition. (Joyal’s “theory of species” [286] and the
book by Bergeron, Labelle, and Leroux [39] make great use of such ideas and show that an
extensive theory of combinatorial enumeration can be based on the concept of substitution.) ◁


▷ II.24. Distinct component sizes. The EGFs of permutations with cycles of distinct lengths
and of set partitions with parts of distinct sizes are

    $\prod_{n=1}^{\infty}\Bigl(1 + \frac{z^n}{n}\Bigr), \qquad \prod_{n=1}^{\infty}\Bigl(1 + \frac{z^n}{n!}\Bigr).$

The probability that a permutation of $\mathcal{P}_n$ has distinct cycle sizes tends to $e^{-\gamma}$; see [249,
Sec. 4.1.6] for a Tauberian argument and [400] for precise asymptotics. The corresponding
analysis for set partitions is treated in the seven-author paper [295]. ◁

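II. 6.2. Implicit structures. Let $\mathcal{X}$ be a labelled class implicitly defined by either
of the equations

    $\mathcal{A} = \mathcal{B} + \mathcal{X}, \qquad \mathcal{A} = \mathcal{B} \star \mathcal{X}.$

Then, solving the corresponding EGF equations leads to

    $X(z) = A(z) - B(z), \qquad X(z) = \frac{A(z)}{B(z)},$

respectively. For the composite labelled constructions SEQ, SET, CYC, the algebra is
equally easy.


Theorem II.4 (Implicit specifications). The generating functions associated to the
implicit equations in $\mathcal{X}$

    $\mathcal{A} = \operatorname{SEQ}(\mathcal{X}), \qquad \mathcal{A} = \operatorname{SET}(\mathcal{X}), \qquad \mathcal{A} = \operatorname{CYC}(\mathcal{X}),$

are respectively

    $X(z) = 1 - \frac{1}{A(z)}, \qquad X(z) = \log A(z), \qquad X(z) = 1 - e^{-A(z)}.$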
A(z) 
EXAMPLE II.15. Connected graphs. In the context of graphical enumerations, the labelled
set construction takes the form of an enumerative formula relating a class of graphs $\mathcal{G}$ and the
subclass of its connected graphs $\mathcal{K} \subset \mathcal{G}$:

    $\mathcal{G} = \operatorname{SET}(\mathcal{K}) \quad\Longrightarrow\quad G(z) = e^{K(z)}.$

This basic formula is known in graph theory [259] as the exponential formula.
Consider the class $\mathcal{G}$ of all (undirected) labelled graphs, the size of a graph being the
number of its nodes. Since a graph is determined by the choice of its set of edges, there are $\binom{n}{2}$
potential edges each of which may be taken in or out, so that $G_n = 2^{\binom{n}{2}}$. Let $\mathcal{K} \subset \mathcal{G}$ be the
subclass of all connected graphs. The exponential formula determines $K(z)$ implicitly,

    $K(z) = \log\Bigl(1 + \sum_{n \ge 1} 2^{\binom{n}{2}}\,\frac{z^n}{n!}\Bigr) = z + \frac{z^2}{2!} + 4\frac{z^3}{3!} + 38\frac{z^4}{4!} + 728\frac{z^5}{5!} + \cdots,$

where the sequence is EIS A001187. The series is divergent, that is, it has radius of convergence 0.
It can nonetheless be manipulated as a formal series (APPENDIX A: Formal power
series, p. 676). Expanding by means of $\log(1+u) = u - u^2/2 + \cdots$ yields a complicated
convolution expression for $K_n$:

    $K_n = 2^{\binom{n}{2}} - \frac12 \sum \binom{n}{n_1, n_2}\,2^{\binom{n_1}{2} + \binom{n_2}{2}} + \frac13 \sum \binom{n}{n_1, n_2, n_3}\,2^{\binom{n_1}{2} + \binom{n_2}{2} + \binom{n_3}{2}} - \cdots.$

(The $k$th term is a sum over $n_1 + \cdots + n_k = n$, with $0 < n_j < n$.) Given the very fast increase
of $G_n$ with $n$, for instance

    $2^{\binom{n+1}{2}} = 2^{n}\,2^{\binom{n}{2}},$

a detailed analysis of the various terms of the expression of $K_n$ shows predominance of the first
sum, and, in that sum itself, predominance of the extreme terms corresponding to $n_1 = n-1$
or $n_2 = n-1$, so that

(50)    $K_n = 2^{\binom{n}{2}}\,\bigl(1 - 2n\,2^{-n} + o(2^{-n})\bigr).$

Thus, almost all labelled graphs of size $n$ are connected. In addition, the error term decreases
very fast: for instance, for $n = 18$, an exact computation based on the generating function formula
reveals that a proportion of only 0.0001373291074 of all the graphs are not connected; this
is extremely close to the value 0.0001373291016 predicted by the main terms in the asymptotic
formula (50). Notice that here good use could be made of a purely divergent generating function
for asymptotic enumeration purposes. .............. END OF EXAMPLE II.15.
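
The exact computation alluded to in this example can be organized without any series manipulation. The sketch below is one possible way to do it, not necessarily the one used by the authors: it inverts the exponential formula through the classical recurrence obtained by isolating the component that contains node $n$, namely $G_n = \sum_{j=1}^{n} \binom{n-1}{j-1} K_j\,G_{n-j}$. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def connected_graph_counts(n_max):
    """Return (G, K): G[n] = number of labelled graphs, K[n] = number of connected ones."""
    g = [2 ** comb(n, 2) for n in range(n_max + 1)]
    k = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Subtract the graphs in which the component of node n has size j < n.
        k[n] = g[n] - sum(comb(n - 1, j - 1) * k[j] * g[n - j] for j in range(1, n))
    return g, k

def disconnected_fraction(n):
    """Exact proportion of labelled graphs of size n that are not connected."""
    g, k = connected_graph_counts(n)
    return 1 - Fraction(k[n], g[n])
```

Usage: `float(disconnected_fraction(18))` can be set against the main term $2n\,2^{-n}$ of (50) evaluated at $n = 18$, that is $36/2^{18}$.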


▷ II.25. Bipartite graphs. A plane bipartite graph is a pair $(G, \omega)$ where $G$ is a labelled graph,
$\omega = (\omega_W, \omega_E)$ is a bipartition of the nodes (into West and East categories), and the edges are
such that they only connect nodes from $\omega_W$ to nodes of $\omega_E$. A direct count shows that the EGF
of plane bipartite graphs is

    $\Gamma(z) = \sum_{n} \gamma_n\,\frac{z^n}{n!}$ with $\gamma_n = \sum_{k} \binom{n}{k}\,2^{k(n-k)}.$

The EGF of plane bipartite graphs that are connected is $\log \Gamma(z)$.
A bipartite graph is a labelled graph whose nodes can be partitioned into two groups so
that edges only connect nodes of different groups. The EGF of bipartite graphs is

    $\exp\Bigl(\frac12 \log \Gamma(z)\Bigr) = \sqrt{\Gamma(z)}.$

[Hint. The EGF of a connected bipartite graph is $\frac12 \log \Gamma(z)$, as a factor of $\frac12$ kills the East-West
orientation present in a connected plane bipartite graph. See Wilf’s book [496, p. 78] for
details.] ◁


▷ II.26. Do two permutations generate the symmetric group? To two permutations $\sigma, \tau$ of the
same size, associate a graph $\Gamma_{\sigma,\tau}$ whose set of vertices is $V = [1..n]$, if $n = |\sigma| = |\tau|$, and whose
set of edges is formed of all the pairs $(x, \sigma(x))$, $(x, \tau(x))$, for $x \in V$. The probability that a
random $\Gamma_{\sigma,\tau}$ is connected is

    $\pi_n = \frac{1}{n!}\,[z^n]\,\log\Bigl(\sum_{n \ge 0} n!\,z^n\Bigr).$

This represents the probability that two permutations generate a transitive group (that is, for all
$x, y \in [1..n]$, there exists a composition of $\sigma, \sigma^{-1}, \tau, \tau^{-1}$ that maps $x$ to $y$). One has

(51)    $\pi_n \sim 1 - \frac{1}{n} - \frac{1}{n^2} - \frac{4}{n^3} - \frac{23}{n^4} - \frac{171}{n^5} - \frac{1542}{n^6} - \cdots.$

Surprisingly, the coefficients 1, 1, 4, 23, ... [EIS A084357] in (51) enumerate a “third-level”
structure (cf Subsection II. 4.2): $\operatorname{SET}(\operatorname{SET}_{\ge 1}(\operatorname{SEQ}_{\ge 1}(\mathcal{Z})))$. Also, one has $n!^2\,\pi_n = (n-1)!\,I_{n+1}$,
where $I_{n+1}$ is the number of indecomposable permutations of size $n+1$ (Example I.17, p. 82).
Let $\pi_n^{\star}$ be the probability that two random permutations generate the whole symmetric
group. Then, by a result of Babai based on the classification of groups, the quantity $\pi_n - \pi_n^{\star}$ is
exponentially small, so that (51) also applies to $\pi_n^{\star}$. [Based on Dixon [130].] ◁
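
The quantities of this note are easy to tabulate exactly. The sketch below treats pairs of permutations as a labelled class with $(n!)^2$ objects of size $n$, extracts the connected ones with the same "component of node $n$" recurrence that inverts the exponential formula, and computes indecomposable permutations from the standard recurrence $n! = \sum_{k=1}^{n} I_k\,(n-k)!$. Function names are hypothetical, and the recurrences are our choice of method rather than something prescribed by the note.

```python
from fractions import Fraction
from math import comb, factorial

def connected_pairs(n_max):
    """C[n] = number of pairs (sigma, tau) of size n whose graph Gamma_{sigma,tau} is connected."""
    total = [factorial(n) ** 2 for n in range(n_max + 1)]
    c = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        c[n] = total[n] - sum(comb(n - 1, j - 1) * c[j] * total[n - j] for j in range(1, n))
    return c

def indecomposable_permutations(n_max):
    """I[n] = number of permutations of size n with no proper prefix stable under the permutation."""
    ind = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        ind[n] = factorial(n) - sum(ind[k] * factorial(n - k) for k in range(1, n))
    return ind

def transitive_probability(n):
    """pi_n as an exact fraction."""
    return Fraction(connected_pairs(n)[n], factorial(n) ** 2)
```

Evaluating `float(transitive_probability(n))` for moderate $n$ and comparing with the first terms of (51) gives an impression of how quickly the expansion becomes accurate.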




▷ II.27. Graphs are not specifiable. The class of all graphs does not admit a specification starting
from single atoms and involving only sums, products, sets and cycles. Indeed, the growth
of $G_n$ is such that the EGF $G(z)$ has radius of convergence 0, whereas EGFs of constructible
classes must have a nonzero radius of convergence, as proved in Chapter IV. ◁


II. 6.3. Order constraints. A construction well suited to taking into account
many order properties of combinatorial structures is the modified labelled product,

    $\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}).$

This denotes the subset of the product $\mathcal{B} \star \mathcal{C}$ formed with elements such that the smallest
label is constrained to lie in the $\mathcal{B}$ component. (To make this definition consistent,
it must be assumed that $B_0 = 0$.) We call this binary operation on structures the boxed
product.


Theorem II.5. The boxed product is admissible:

(52)    $\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}) \quad\Longrightarrow\quad A(z) = \int_0^z \bigl(\partial_t B(t)\bigr) \cdot C(t)\,dt, \qquad \partial_t \equiv \frac{d}{dt}.$


PROOF. The definition of boxed products implies the coefficient relation

    $A_n = \sum_{k=1}^{n} \binom{n-1}{k-1}\,B_k\,C_{n-k}.$

The binomial coefficient that appears in the standard labelled product is now modified
since only $n-1$ labels need to be distributed between the two components, $k-1$
going to the $\mathcal{B}$ component (that is constrained to contain the label 1 already) and $n-k$
to the $\mathcal{C}$ component. From the equivalent form

    $A_n = \frac{1}{n} \sum_{k=0}^{n} \binom{n}{k}\,(k B_k)\,C_{n-k},$

the result follows by taking EGFs.
A useful special case is the min-rooting operation,

    $\mathcal{A} = (\mathcal{Z}^{\square} \star \mathcal{C}),$

for which a variant definition goes as follows. Take in all possible ways elements
$\gamma \in \mathcal{C}$, prepend an atom with a label smaller than the labels of $\gamma$, for instance 0, and
relabel in the canonical way over $[1..(n+1)]$ by shifting all label values by 1. Clearly
$A_{n+1} = C_n$, which yields

    $A(z) = \int_0^z C(t)\,dt,$

a result also consistent with the general formula of boxed products.

FIGURE II.13. A numerical sequence of size 100 with records marked by circles: there
are 7 records that occur at times 1, 3, 5, 11, 60, 86, 88.
For some applications, it is easier to impose constraints on the maximal label
rather than the minimum. The max-boxed product written

    $\mathcal{A} = (\mathcal{B}^{\blacksquare} \star \mathcal{C}),$

is then defined by the fact that the maximum is constrained to lie in the $\mathcal{B}$-component of
the labelled product. Naturally, the translation by an integral in (52) remains valid for
this trivially modified boxed product.


▷ II.28. Combinatorics of integration. In the perspective of this book, integration by parts has
an immediate interpretation. Indeed, the equality,

    $\int_0^z A'(t) \cdot B(t)\,dt = A(z) \cdot B(z) - \int_0^z A(t) \cdot B'(t)\,dt,$

reads off as: “The smallest label in an ordered pair, if it appears on the left, cannot appear on
the right.” ◁
EXAMPLE II.16. Records in permutations. Given a sequence of numerical data, $x =
(x_1, \ldots, x_n)$, assumed all distinct, a record in that sequence is defined to be an element $x_j$
such that $x_k < x_j$ for all $k < j$. (A record is an element “better” than its predecessors!) Fig-
ure 13 displays a numerical sequence of length n = 100 that has 7 records. Confronted with such
data, a statistician will typically want to determine whether the data obey purely random fluctu-
ations or there could be some indications of a “trend” or of a “bias” [108, Ch. 10]. (Think of the
data as reflecting share prices or athletic records, say.) In particular, if the $x_j$ are independently
drawn from a continuous distribution, then the number of records obeys the same laws as in a
random permutation of $[1\,..\,n]$. This statistical preamble then invites the question: How many
permutations of size n have k records?
First, we start with a special brand of permutations, the ones that have their maximum at
the beginning. Such permutations are defined as ($\blacksquare$ indicates the boxed product based on the
maximum label)

\[ \mathcal{Q} = (\mathcal{Z}^{\blacksquare} \star \mathcal{P}), \]
where $\mathcal{P}$ is the class of all permutations. Observe that this gives the EGF

\[ Q(z) = \int_0^z \left(\frac{d}{dt}\, t\right) \cdot \frac{1}{1-t}\, dt = \log\frac{1}{1-z}, \]

implying the obvious result $Q_n = (n-1)!$ for all $n \ge 1$. These are exactly the permutations
with one record. Next, consider the class 


\[ \mathcal{P}^{(k)} = \mathrm{SET}_k(\mathcal{Q}). \]

The elements of $\mathcal{P}^{(k)}$ are unordered sets of cardinality k with elements of type $\mathcal{Q}$. Define
the (max) leader of any component of $\mathcal{P}^{(k)}$ as the value of its maximal element. Then, if we
place the components in sequence, ordered by increasing values of their leaders, and read off
the whole sequence, we obtain a permutation with exactly k records. The correspondence⁸ is
clearly reversible. Here is an illustration, in which the leaders 7, 4, and 9 are the first elements
of their components:

\[ \{(7,2,6,1),\ (4,3),\ (9,8,5)\} \;\cong\; [(4,3),\ (7,2,6,1),\ (9,8,5)] \;\cong\; 4, 3, 7, 2, 6, 1, 9, 8, 5. \]


Thus, the number of permutations with k records is determined by 


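\[ P^{(k)}(z) = \frac{1}{k!} \left(\log\frac{1}{1-z}\right)^{k} \quad\text{and}\quad P^{(k)}_n = \begin{bmatrix} n \\ k \end{bmatrix}, \]

where we recognize Stirling cycle numbers from Example 12. In other words:
The number of permutations of size n having k records is counted by the
Stirling “cycle” number $\begin{bmatrix} n \\ k \end{bmatrix}$.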
Returning to our statistical problem, the treatment of Example 12 p. 112 (to be revisited
in Chapter III) shows that the expected number of records in a random permutation of size n
equals $H_n$, the harmonic number. One has $H_{100} \doteq 5.18$, so that for 100 data items, a little
more than 5 records are expected on average. The probability of observing 7 records or more
is still about 23%, an altogether not especially rare event. In contrast, observing twice as many
records, that is, 14, would be a fairly strong indication of a bias since, on random data, the
event has probability very close to $10^{-4}$. Altogether, the present discussion is consistent with
the hypothesis for the data of Figure 13 to have been generated independently at random (and
indeed they were). END OF EXAMPLE II.16.
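
The statements of this example can be checked by machine. The sketch below is an illustration under stated assumptions: it counts records by brute force for small n, and it obtains the Stirling cycle numbers as the coefficients of the product $u(u+1)\cdots(u+n-1)$, a standard identity that is assumed here rather than derived. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def count_records(seq):
    """Number of records (left-to-right maxima) in a sequence of distinct values."""
    best, records = None, 0
    for x in seq:
        if best is None or x > best:
            best, records = x, records + 1
    return records

def stirling_cycle_row(n):
    """Coefficients of u(u+1)...(u+n-1); entry k is the Stirling cycle number [n, k]."""
    row = [1]                                   # empty product, n = 0
    for m in range(n):                          # multiply the polynomial by (u + m)
        new = [0] * (len(row) + 1)
        for k, coeff in enumerate(row):
            new[k] += m * coeff
            new[k + 1] += coeff
        row = new
    return row

def records_histogram(n):
    """hist[k] = number of permutations of size n with exactly k records."""
    hist = [0] * (n + 1)
    for p in permutations(range(1, n + 1)):
        hist[count_records(p)] += 1
    return hist

def record_tail_probability(n, k):
    """P(at least k records) in a uniform random permutation of size n."""
    row = stirling_cycle_row(n)
    return Fraction(sum(row[k:]), sum(row))

print(records_histogram(5), stirling_cycle_row(5))
print(float(record_tail_probability(100, 7)), float(record_tail_probability(100, 14)))
```

The two tail probabilities printed on the last line are the quantities that the text quotes for 7 and 14 records among 100 data items.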


It is possible to base a fair part of the theory of labelled constructions on sums and 
products in conjunction with the boxed product. In effect, consider the three relations 


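\[
\begin{array}{lll}
\mathcal{F} = \mathrm{SEQ}\{\mathcal{G}\} \;\Longrightarrow & f(z) = \dfrac{1}{1-g(z)}, & f = 1 + g f \\[2ex]
\mathcal{F} = \mathrm{SET}\{\mathcal{G}\} \;\Longrightarrow & f(z) = e^{g(z)}, & f = 1 + \displaystyle\int g' f \\[2ex]
\mathcal{F} = \mathrm{CYC}\{\mathcal{G}\} \;\Longrightarrow & f(z) = \log\dfrac{1}{1-g(z)}, & f = \displaystyle\int g' \, \dfrac{1}{1-g}.
\end{array}
\]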

⁸ This correspondence can also be viewed as a transformation on permutations that maps the number
of records to the number of cycles; it is known as Foata’s fundamental correspondence [337, Sec. 10.2].
The last column is easily checked to provide an alternative form of the standard
operator corresponding to sequences, sets, and cycles. Each case is then itself deduced
directly from Theorem II.5 and the labelled product rule:


Sequences: they obey the recursive definition
\[ \mathcal{F} = \mathrm{SEQ}\{\mathcal{G}\} \;\Longrightarrow\; \mathcal{F} \cong \{\epsilon\} + (\mathcal{G} \star \mathcal{F}). \]
Sets: we have
\[ \mathcal{F} = \mathrm{SET}\{\mathcal{G}\} \;\Longrightarrow\; \mathcal{F} \cong \{\epsilon\} + (\mathcal{G}^{\blacksquare} \star \mathcal{F}), \]


which means that, in a set, one can always single out the component with 
the largest label, the rest of the components forming a set. In other words, 
when this construction is repeated, the elements of a set can be canonically 
arranged according to increasing values of their largest labels, the “leaders”. 
(We recognize here a generalization of the construction used for records in 
permutations.) 

Cycles: The element of a cycle that contains the largest label can be taken 
canonically as the cycle “starter”, which is then followed by an arbitrary 
sequence of elements upon traversing the cycle in circular order. Thus 


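\[ \mathcal{F} = \mathrm{CYC}\{\mathcal{G}\} \;\Longrightarrow\; \mathcal{F} \cong (\mathcal{G}^{\blacksquare} \star \mathrm{SEQ}\{\mathcal{G}\}). \]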


Greene [251] has developed a complete framework of labelled grammars based 
on standard and boxed labelled products. In its basic form, its expressive power is 
essentially equivalent to ours, because of the above relations. More complicated order 
constraints, dealing simultaneously with a collection of larger and smaller elements, 
can be furthermore taken into account within this framework. 


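▷ II.29. Higher order constraints, after Greene. Let the symbols $\square$, $\boxdot$, $\blacksquare$ represent smallest,
second smallest, and largest labels respectively. One has the correspondences (with $\partial_z = \frac{d}{dz}$)

\[
\begin{array}{ll}
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}^{\blacksquare}) & \partial_z^2 A(z) = (\partial_z B(z)) \cdot (\partial_z C(z)) \\[1ex]
\mathcal{A} = (\mathcal{B}^{\square\,\blacksquare} \star \mathcal{C}) & \partial_z^2 A(z) = (\partial_z^2 B(z)) \cdot C(z) \\[1ex]
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}^{\boxdot} \star \mathcal{D}^{\blacksquare}) & \partial_z^3 A(z) = (\partial_z B(z)) \cdot (\partial_z C(z)) \cdot (\partial_z D(z)),
\end{array}
\]

and so on. These can be transformed into (iterated) integral representations. [See [251] for
more.] ◁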


The next two examples demonstrate the usefulness of min-rooting used in con- 
junction with recursion. In this way, trees satisfying some order conditions can be 
constructed and enumerated easily. This in turn gives access to new characteristics
of permutations. 


EXAMPLE II.17. Increasing binary trees and alternating permutations. To each permutation,
one can associate bijectively a binary tree of a special type called an increasing binary tree and 
sometimes a heap-ordered tree or a tournament tree. This is a plane rooted binary tree in which 
internal nodes bear labels in the usual way, but with the additional constraint that node labels 
increase along any branch stemming from the root. Such trees are closely related to classical 
data structures of computer science, like heaps and binomial queues. 

FIGURE II.14. A permutation of size 7 and its increasing binary tree lifting.

The correspondence (Figure 14) is as follows: Given a permutation of a set written as a
word, $\sigma = \sigma_1 \sigma_2 \cdots \sigma_n$, factor it in the form $\sigma = \sigma_L \cdot \min(\sigma) \cdot \sigma_R$, with $\min(\sigma)$ the smallest
label value in the permutation, and $\sigma_L$, $\sigma_R$ the factors left and right of $\min(\sigma)$. Then the binary
tree $\beta(\sigma)$ is defined recursively in the format $\langle$root, left, right$\rangle$ by

\[ \beta(\sigma) = \langle \min(\sigma),\, \beta(\sigma_L),\, \beta(\sigma_R) \rangle, \qquad \beta(\epsilon) = \epsilon. \]


The empty tree (consisting of a unique external node of size 0) goes with the empty permutation 
e. Conversely, reading the labels of the tree in symmetric (infix) order gives back the original 
permutation. (The correspondence is described for instance in Stanley’s book [447, p. 23-25] 
who says that “it has been primarily developed by the French”, pointing at [219].) 

Thus, the family $\mathcal{I}$ of binary increasing trees satisfies the recursive definition

\[ \mathcal{I} = \{\epsilon\} + (\mathcal{Z}^{\square} \star \mathcal{I} \star \mathcal{I}), \]

which implies the nonlinear integral equation for the EGF

\[ I(z) = 1 + \int_0^z I(t)^2\, dt. \]

This equation reduces to $I'(z) = I(z)^2$ and, under the initial condition $I(0) = 1$, it admits the
solution $I(z) = (1-z)^{-1}$. Thus $I_n = n!$, which is consistent with the fact that there are as
many increasing binary trees as there are permutations.
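
The correspondence between permutations and increasing binary trees is short enough to be sketched in code. In the illustration below, a tree is represented as a nested tuple `(root, left, right)` and the empty tree as `None`; this representation and the function names are choices made for the sketch, not notation from the text.

```python
from itertools import permutations

def increasing_tree(word):
    """Increasing binary tree of a word of distinct labels: split at the minimum."""
    if not word:
        return None                                   # empty tree for the empty word
    i = word.index(min(word))
    return (word[i], increasing_tree(word[:i]), increasing_tree(word[i + 1:]))

def infix(tree):
    """Labels read in symmetric (infix) order: left subtree, root, right subtree."""
    if tree is None:
        return []
    root, left, right = tree
    return infix(left) + [root] + infix(right)

def is_increasing(tree, bound=0):
    """Check that labels increase along every branch from the root."""
    if tree is None:
        return True
    root, left, right = tree
    return root > bound and is_increasing(left, root) and is_increasing(right, root)

# Every permutation of size 5 yields a distinct increasing tree, and infix reading inverts the map.
trees = {increasing_tree(list(p)) for p in permutations(range(1, 6))}
print(len(trees))
print(increasing_tree([2, 1, 3]))
```

Since infix traversal recovers the word, the map is injective, and the count of distinct trees for size 5 can be compared with 5! directly.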

The construction of increasing trees associated with permutations is instrumental in deriving
EGFs relative to various local order patterns in permutations. We illustrate its use here by
counting the number of up-and-down (or zig-zag) permutations, also known as alternating
permutations. The result, already mentioned in our Invitation chapter, was first derived by Désiré
André in 1881 by means of a direct recurrence argument.

A permutation $\sigma = \sigma_1 \sigma_2 \cdots \sigma_n$ is an alternating permutation if

\[ \sigma_1 > \sigma_2 < \sigma_3 > \sigma_4 < \cdots, \tag{53} \]
so that pairs of consecutive elements form a succession of ups and downs.

[Illustration: an alternating permutation drawn as a zigzag path.]


Consider first the case of an alternating permutation of odd size. It can be checked that the
corresponding increasing trees have no one-way branching nodes, so that they consist solely of
binary nodes and leaves. Thus, the corresponding specification is

\[ \mathcal{J} = \mathcal{Z} + (\mathcal{Z}^{\square} \star \mathcal{J} \star \mathcal{J}), \]
so that
\[ J(z) = z + \int_0^z J(t)^2\, dt \quad\text{and}\quad \frac{d}{dz} J(z) = 1 + J(z)^2. \]

The equation admits separation of variables, which implies (with $J(0) = 0$)

\[ J(z) = \tan(z) = z + 2\,\frac{z^3}{3!} + 16\,\frac{z^5}{5!} + 272\,\frac{z^7}{7!} + \cdots. \]
The coefficients $J_{2n+1}$ are known as the tangent numbers or the Euler numbers of odd index
(EIS A000182).
Alternating permutations of even size defined by the constraint (53) and denoted by $\overline{\mathcal{J}}$ can
be determined from

\[ \overline{\mathcal{J}} = \{\epsilon\} + (\mathcal{Z}^{\square} \star \mathcal{J} \star \overline{\mathcal{J}}), \]
since now all internal nodes of the tree representation are binary, except for the rightmost one
that only branches on the left. Thus, $\overline{J}'(z) = \tan(z)\, \overline{J}(z)$, and the EGF is

\[ \overline{J}(z) = \frac{1}{\cos(z)} = 1 + 1\,\frac{z^2}{2!} + 5\,\frac{z^4}{4!} + 61\,\frac{z^6}{6!} + 1385\,\frac{z^8}{8!} + \cdots, \]

where the coefficients $\overline{J}_{2n}$ are the secant numbers, also known as Euler numbers of even index
(EIS A000364). END OF EXAMPLE II.17.
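
As a numerical cross-check, the two differential relations can be turned into recurrences on coefficients and compared with a brute-force count. The sketch below assumes the relations $J' = 1 + J^2$ and $\overline{J}' = J \cdot \overline{J}$ read off the specifications above; extracting $n!\,[z^n]$ on both sides gives the binomial convolutions used in the code. Function names are illustrative.

```python
from itertools import permutations
from math import comb

def is_alternating(p):
    """Check the pattern p[0] > p[1] < p[2] > p[3] < ... of constraint (53)."""
    return all((p[i] > p[i + 1]) if i % 2 == 0 else (p[i] < p[i + 1])
               for i in range(len(p) - 1))

def count_alternating(n):
    """Brute-force count of alternating permutations of size n."""
    return sum(1 for p in permutations(range(1, n + 1)) if is_alternating(p))

def tangent_numbers(n_max):
    """J_n = n! [z^n] tan z, from J' = 1 + J^2 with J(0) = 0."""
    J = [0] * (n_max + 1)
    for n in range(n_max):
        J[n + 1] = (1 if n == 0 else 0) + sum(
            comb(n, k) * J[k] * J[n - k] for k in range(n + 1))
    return J

def secant_numbers(n_max):
    """S_n = n! [z^n] 1/cos z, from S' = J * S with S(0) = 1."""
    J = tangent_numbers(n_max)
    S = [1] + [0] * n_max
    for n in range(n_max):
        S[n + 1] = sum(comb(n, k) * J[k] * S[n - k] for k in range(n + 1))
    return S

print(tangent_numbers(7))
print(secant_numbers(8))
print([count_alternating(n) for n in range(1, 8)])
```

Odd sizes should match the tangent numbers and even sizes the secant numbers, since the brute-force counter treats both parities with the same constraint (53).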


Use will be made later in this book (Chapter II, p. 22) of this important tree 
representation of permutations as it opens access to parameters like the number of 
descents, runs, and (once more!) records in permutations. Analyses of increasing trees 
also inform us of crucial performance issues regarding binary search trees, quicksort, 
and heap-like priority queue structures [351, 434, 486, 488]. 
▷ II.30. Combinatorics of trigonometrics. Interpret $\tan\frac{z}{1-z}$, $\tan\tan z$, $\tan(e^z - 1)$ as EGFs
of combinatorial classes. ◁


EXAMPLE II.18. Increasing Cayley trees and regressive mappings. An increasing Cayley tree
is a Cayley tree (i.e., it is nonplane and rooted) whose labels along any branch stemming from 
the root form an increasing sequence. In particular, the minimum must occur at the root, and 
no plane embedding is implied. Let K be the class of such trees. The recursive specification is 
now 


\[ \mathcal{K} = (\mathcal{Z}^{\square} \star \mathrm{SET}\{\mathcal{K}\}). \]

FIGURE II.15. An increasing Cayley tree (left) and its associated regressive mapping
(right).

The generating function thus satisfies the functional relations

\[ K(z) = \int_0^z e^{K(t)}\, dt, \qquad K'(z) = e^{K(z)}, \]

with $K(0) = 0$. Integration of $K' e^{-K} = 1$ shows that

\[ K(z) = \log\frac{1}{1-z} \quad\text{and}\quad K_n = (n-1)!. \]
Thus the number of increasing Cayley trees is (n—1)!, which is also the number of permutations 
of size n — 1. These trees have been studied by Meir and Moon [356] under the name of 
“recursive trees”, a terminology that we do not however retain here. 

The simplicity of the formula $K_n = (n-1)!$ certainly calls for a combinatorial interpreta-
tion. In fact, an increasing Cayley tree is fully determined by its child-parent relationship (Fig-
ure 15). Otherwise said, to each increasing Cayley tree $\tau$, we associate a partial map $\phi = \phi_\tau$
such that $\phi(i) = j$ iff the label of the parent of i is j. Since the root of the tree is an orphan,
the value of $\phi(1)$ is undefined, $\phi(1) = \bot$; since the tree is increasing, one has $\phi(i) < i$ for
all $i \ge 2$. A function satisfying these last two conditions is called a regressive mapping. The
correspondence between trees and regressive mappings is then easily seen to be a bijective one.

Thus regressive mappings on the domain $[1\,..\,n]$ and increasing Cayley trees are equinu-
merous, so that we may as well use $\mathcal{K}$ to denote the class of regressive mappings. Now, a regres-
sive mapping of size n is evidently determined by a single choice for $\phi(2)$ (since $\phi(2) = 1$),
two possible choices for $\phi(3)$ (either of 1, 2), and so on. Hence the formula

\[ K_n = 1 \times 2 \times 3 \times \cdots \times (n-1) \]

receives a natural interpretation. END OF EXAMPLE II.18.
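
The counting argument can be replayed by exhaustive generation. The following sketch, with hypothetical function names, lists all regressive mappings as dictionaries `{child: parent}`, rebuilds the rooted tree from each, and checks that labels increase along every branch.

```python
from itertools import product
from math import factorial

def regressive_mappings(n):
    """All maps phi on [2..n] with phi(i) < i, as dicts {child: parent}; phi(1) is undefined."""
    ranges = [range(1, i) for i in range(2, n + 1)]   # i - 1 choices for phi(i)
    for choice in product(*ranges):
        yield dict(zip(range(2, n + 1), choice))

def tree_of(phi, n):
    """Children lists of the rooted tree whose parent map is phi; node 1 is the root."""
    children = {i: [] for i in range(1, n + 1)}
    for child, parent in phi.items():
        children[parent].append(child)
    return children

def labels_increase(children, node=1):
    """Check that every child carries a larger label than its parent, recursively."""
    return all(c > node and labels_increase(children, c) for c in children[node])

for n in range(1, 7):
    maps = list(regressive_mappings(n))
    assert all(labels_increase(tree_of(phi, n)) for phi in maps)
    print(n, len(maps), factorial(n - 1))
```

Because every parent label is smaller than its child, following parents from any node always reaches the root 1, so each mapping does describe a tree.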


▷ II.31. Regressive mappings and permutations. Regressive mappings can be related directly
to permutations. The construction that associates a regressive mapping to a permutation is
called the “inversion table” construction; see [307, 434]. Given a permutation $\sigma = \sigma_1, \ldots, \sigma_n$,
associate to it a function $\psi = \psi_\sigma$ from $[1\,..\,n]$ to $[0\,..\,n-1]$ by the rule

\[ \psi(j) = \operatorname{card}\{k < j \mid \sigma_k > \sigma_j\}. \]

The function $\psi$ is a trivial variant of a regressive mapping. ◁
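
The inversion table and its inverse can be sketched as follows. The decoding step is one possible reconstruction, chosen here for brevity and not taken from the text: it scans positions from right to left and uses the fact that the entry at position j has exactly $\psi(j)$ larger values among the entries not yet placed.

```python
from itertools import permutations

def inversion_table(sigma):
    """psi[j] = number of positions k < j with sigma[k] > sigma[j] (0-based lists)."""
    return [sum(1 for k in range(j) if sigma[k] > sigma[j])
            for j in range(len(sigma))]

def from_inversion_table(psi):
    """Rebuild the permutation of [1..n] whose inversion table is psi."""
    remaining = list(range(1, len(psi) + 1))      # values not yet placed, sorted
    sigma = [0] * len(psi)
    for j in range(len(psi) - 1, -1, -1):
        # among the j + 1 remaining values, pick the one with psi[j] larger values
        sigma[j] = remaining.pop(j - psi[j])
    return sigma

print(inversion_table([3, 1, 2]))
print(from_inversion_table([0, 1, 1]))
```

In 0-based indexing the table satisfies `0 <= psi[j] <= j`, which is the shifted form of the condition defining a regressive mapping.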
▷ II.32. Rotations and increasing trees. An increasing Cayley tree can be canonically drawn
by ordering descendants of each node from left to right according to their label values. The
rotation correspondence (p. 69) then gives rise to a binary increasing tree. Hence, increasing
Cayley trees and increasing binary trees are also directly related. Summarizing this note and
the previous one, we have a quadruple combinatorial connection,


Increasing Cayley trees ≅ Regressive mappings ≅ Permutations ≅ Increasing binary trees,


that opens the way to yet more permutation enumerations. ◁


II.7. Perspective 


Together with the previous chapter and Figure I.14, this chapter and Figure 16 
provide the basis for the symbolic method that is at the core of analytic combinatorics. 
The translations of the basic constructions for labelled classes to EGFs could hardly
be simpler, but, as we have seen, they are sufficiently powerful to embrace numerous 
classical results in combinatorics, ranging from the birthday and coupon collector 
problems to graph enumeration. 

The examples that we have considered for second-level structures, trees, map- 
pings, and graphs lead to EGFs that are simple to express and natural to generalize. 
(Often, the simple form is misleading—direct derivations of many of these EGFs that 
do not appeal to the symbolic method can be rather intricate.) Indeed, the symbolic 
method provides a framework that allows us to understand the nature of many of these 
combinatorial classes. From there, numerous seemingly scattered counting problems 
can be organized into broad structural categories and solved in an almost mechanical 
manner. 

Again, the symbolic method is only half of the story (the “combinatorics” in 
analytic combinatorics), leading to EGFs for the counting sequences of numerous 
interesting combinatorial classes. While some of these EGFs lead immediately to ex- 
plicit counting results, others require the classical techniques in complex analysis and 
asymptotic analysis that are covered in Part B (the “analytic” part of analytic combi- 
natorics) to deliver asymptotic estimates. Together with these techniques, the basic 
constructions, translations, and applications that we have discussed in this chapter re- 
inforce the overall message that the symbolic method is a systematic approach that 
is successful for addressing classical and new problems in combinatorics, generaliza- 
tions, and applications. 

We have been focussing on enumeration problems—counting the number of ob- 
jects of a given size in a combinatorial class. In the next chapter, we consider how to 
extend the symbolic method to help analyse other properties of combinatorial classes. 

The labelled set construction and the exponential formula were recognized early by re- 
searchers working in the area of graphical enumerations [259]. Foata [217] proposed a detailed 
formalization in 1974 of labelled constructions, especially sequences and sets, under the names 
of partitional complex; a brief account is also given by Stanley in his survey [445]. This is par- 
allel to the concept of “prefab” due to Bender and Goldman [34]. The books by Comtet [98], 
Wilf [496], Stanley [447], or Goulden and Jackson [244] have many examples of the use of 
labelled constructions in combinatorial analysis. 

Greene [251] has introduced a general framework of “labelled grammars” largely based 
on the boxed product with implications for the random generation of combinatorial structures 
in his 1983 dissertation. Joyal’s theory of species dating from 1981 (see [286] for the original 
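1. The main constructions of union, product, sequence, set, and cycle for labelled
structures together with their translation into exponential generating functions.

    Construction                                              EGF
    Union         $\mathcal{A} = \mathcal{B} + \mathcal{C}$                 $A(z) = B(z) + C(z)$
    Product       $\mathcal{A} = \mathcal{B} \star \mathcal{C}$             $A(z) = B(z) \cdot C(z)$
    Sequence      $\mathcal{A} = \mathrm{SEQ}\{\mathcal{B}\}$               $A(z) = \frac{1}{1 - B(z)}$
    Set           $\mathcal{A} = \mathrm{SET}\{\mathcal{B}\}$               $A(z) = \exp(B(z))$
    Cycle         $\mathcal{A} = \mathrm{CYC}\{\mathcal{B}\}$               $A(z) = \log\frac{1}{1 - B(z)}$

2. The translation for sequences, sets, and cycles of fixed cardinality.

    Construction                                              EGF
    Sequence      $\mathcal{A} = \mathrm{SEQ}_k\{\mathcal{B}\}$             $A(z) = B(z)^k$
    Set           $\mathcal{A} = \mathrm{SET}_k\{\mathcal{B}\}$             $A(z) = \frac{1}{k!} B(z)^k$
    Cycle         $\mathcal{A} = \mathrm{CYC}_k\{\mathcal{B}\}$             $A(z) = \frac{1}{k} B(z)^k$

3. The additional constructions of pointing and substitution.

    Construction                                              EGF
    Pointing      $\mathcal{A} = \Theta\mathcal{B}$                         $A(z) = z \frac{d}{dz} B(z)$
    Substitution  $\mathcal{A} = \mathcal{B} \circ \mathcal{C}$             $A(z) = B(C(z))$

FIGURE II.16. A “dictionary” of labelled constructions together with their translation
into exponential generating functions (EGFs). The first constructions are counterparts of
the unlabelled constructions of the previous chapter (the multiset construction is not
meaningful here). The translation for composite constructions of bounded cardinality appears
to be simple. Finally, the boxed product is specific to labelled structures. (Compare with
the unlabelled counterpart, Figure 14 of Chapter I, p. 14.)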


article and the book by Bergeron, Labelle, and Leroux [39] for a rich exposition), is based on 
category theory; it presents the advantage of uniting in a common theory the unlabelled and the 
labelled worlds. 
Flajolet, Salvy, and Zimmermann have developed a specification language closely related
to the system exposed here. They show in [206] how to compile automatically specifications
into generating functions; this is complemented by a calculus that produces fast random
generation algorithms [216].


Combinatorial Parameters and 
Multivariate Generating Functions 


Generating functions find averages, etc. 
— HERBERT WILF [496] 


Je n’ai jamais été assez loin pour bien sentir l’application de l’algèbre à la géométrie. Je
n’aimais point cette manière d’opérer sans voir ce qu’on fait, et il me sembloit que résoudre un
problème de géométrie par les équations, c’étoit jouer un air en tournant une manivelle¹.

— JEAN-JACQUES ROUSSEAU, Les Confessions, Livre VI 
Many scientific endeavours demand precise quantitative information on probabilistic
properties of parameters of combinatorial objects. For instance, when designing,
analysing, and optimizing a sorting algorithm, it is of interest to determine what the 
typical disorder of data obeying a given model of randomness is, and do so in the 
mean, or even in distribution, either exactly or asymptotically. Similar situations arise 
in a broad variety of fields, including probability theory and statistics, computer sci- 
ence, information theory, statistical physics, and computational biology. The exact 
problem is then a refined counting problem with two parameters, namely, size and 
additional characteristic: this is the subject addressed in this chapter and treated by a 
natural extension of the generating function framework. (The asymptotic problem can 
be viewed as one of characterizing in the limit a family of probability laws indexed 
by the values of the possible sizes: this is a topic to be discussed in Chapter IX.) 
As demonstrated here, the symbolic methods initially developed for counting com- 
binatorial objects adapt gracefully to the analysis of various sorts of parameters of 
constructible classes, unlabelled and labelled alike. 


¹ “I never went far enough to get a good feel for the application of algebra to geometry. I was not
pleased with this method of operating according to the rules without seeing what one does; solving geomet-
rical problems by means of equations seemed like playing a tune by turning a crank.”

Multivariate generating functions (MGFs), ordinary or exponential, can keep
track of a collection of parameters defined over combinatorial objects. From the 
knowledge of such generating functions, there result either explicit probability dis- 
tributions or, at least, mean and variance evaluations. For inherited parameters, all 
the combinatorial classes discussed so far are amenable to such a treatment and tech- 
nically, the translation schemes that relate combinatorial constructions and multivari- 
ate generating functions present no major difficulty—they appear to be natural (no- 
tational, even) refinements of the paradigm developed in Chapters I and II for the 
univariate case. Typical applications from classical combinatorics are the number of 
summands in a composition, the number of blocks in a set partition, the number of 
cycles in a permutation, the root degree or path length of a tree, the number of fixed 
points in a permutation, the number of singleton blocks in a set partition, the number 
of leaves in trees of various sorts, and so on. 

Beyond its technical aspects anchored in symbolic methods, this chapter also 
serves as a first encounter with the general area of random combinatorial structures. 
The general question is: What does a random object of large size look like? Multi- 
variate generating functions first provide an easy access to moments of combinatorial 
parameters—typically the mean and variance. In addition, when combined with basic 
probabilistic inequalities, moment estimates often lead to precise characterizations of 
properties of large random structures that hold with high probability. For instance, a 
large integer partition conforms with high probability to a deterministic profile, a large 
random permutation almost surely has at least one long cycle and a few short ones, and 
so on. Such a highly constrained behaviour of large objects may in turn serve to design 
dedicated algorithms and optimize data structures; or it may serve to build statistical 
tests—when does one depart from randomness and detect a “signal” in large sets of 
observed data? Randomness aspects form a recurrent theme of the book: they will be 
developed much further in Chapter IX, where complex-asymptotic methods of Part B 
are grafted on the exact modelling by multivariate generating functions presented in 
this chapter. 


This chapter is organized as follows. First a few pragmatic developments related
to bivariate generating functions, the multivariate paradigm specialized to two
variables, are presented in Section III.1. Section III.2 then presents the notion of
bivariate enumeration and its relation to discrete probabilistic models, including the
determination of moments, as the language of elementary probability theory does
provide an intuitively appealing way to conceive of bivariate counting data. The
symbolic method per se declined in its general multivariate version is centrally developed
in Sections III.3 and III.4: with suitable multi-index notations, the extension of the
symbolic method to the multivariate case is almost immediate. Recursive parameters
that often arise in particular from tree statistics form the subject of Section III.5,
while complete generating functions and associated combinatorial models are
discussed in Section III.6. Additional constructions like pointing, substitution, and
order constraints lead to interesting developments, in particular, an original treatment
of the inclusion-exclusion principle in Section III.7. The chapter concludes with
Section III.8, which presents a brief abstract discussion of extremal parameters like height
in trees or smallest and largest components in composite structures; such parameters
are best treated via families of univariate generating functions.


III.1. An introduction to bivariate generating functions (BGFs)


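We have seen in Chapters I and II that a number sequence $(f_n)$ can be encoded
by means of a generating function in one variable, either ordinary or exponential:

\[
(f_n) \;\leadsto\; f(z) =
\begin{cases}
\displaystyle\sum_n f_n z^n & \text{ordinary GF} \\[2ex]
\displaystyle\sum_n f_n \frac{z^n}{n!} & \text{exponential GF.}
\end{cases}
\]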


This encoding is powerful, since many combinatorial constructions admit of a
translation as operations over such generating functions. In this way, one gains access to
many useful counting formulae.

Similarly, consider a sequence of numbers $(f_{n,k})$ depending on two integer valued
indices, n and k. Usually, in this book, $(f_{n,k})$ will be an array of numbers (often
a triangular array), where $f_{n,k}$ is the number of objects $\varphi$ in some class $\mathcal{F}$, such
that $|\varphi| = n$ and some parameter $\chi(\varphi)$ is equal to k. We can encode this sequence
by means of a bivariate generating function (BGF), which involves two variables, z
attached to n and u attached to k.


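Definition III.1. The bivariate generating functions (BGFs), either of the ordinary
or exponential type, of an array $(f_{n,k})$ are the formal power series $f(z, u)$ in two
variables defined by

\[
(f_{n,k}) \;\leadsto\; f(z,u) =
\begin{cases}
\displaystyle\sum_{n,k} f_{n,k}\, z^n u^k & \text{ordinary BGF} \\[2ex]
\displaystyle\sum_{n,k} f_{n,k}\, \frac{z^n}{n!}\, u^k & \text{exponential BGF.}
\end{cases}
\]

(The case of a “double exponential” GF corresponding to $\frac{z^n}{n!} \frac{u^k}{k!}$ is not used in the
book.)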

As we shall see shortly, many parameters of constructible classes become acces- 
sible through such BGFs. According to the point of view adopted momentarily here, 
one starts with an array of numbers and forms a BGF by a double summation pro- 
cess. We present here two examples related to binomial coefficients and Stirling cycle 
numbers illustrating how such BGFs can be determined, then manipulated. In what 
follows it is convenient to refer to the horizontal and vertical generating functions that 
are each a one-parameter family of GFs in a single variable defined by

\[
\begin{array}{lll}
\text{horizontal GF:} & f_n(u) := \displaystyle\sum_k f_{n,k}\, u^k & \\[2ex]
\text{vertical GF:} & f^{\langle k \rangle}(z) := \displaystyle\sum_n f_{n,k}\, z^n & \text{(ordinary case)} \\[2ex]
 & f^{\langle k \rangle}(z) := \displaystyle\sum_n f_{n,k}\, \frac{z^n}{n!} & \text{(exponential case).}
\end{array}
\]

The terminology is transparently explained if the elements $(f_{n,k})$ are arranged as an
infinite matrix, with $f_{n,k}$ placed in row n and column k, since the horizontal and verti-
cal GFs appear as the GFs of the rows and columns respectively (Figure 1). Naturally,
one has

\[
f(z,u) = \sum_k u^k f^{\langle k \rangle}(z) =
\begin{cases}
\displaystyle\sum_n f_n(u)\, z^n & \text{ordinary BGF} \\[2ex]
\displaystyle\sum_n f_n(u)\, \frac{z^n}{n!} & \text{exponential BGF.}
\end{cases}
\]

FIGURE III.1. An array of numbers and its associated horizontal and vertical GFs.

EXAMPLE III.1. The BGF of binomial coefficients. The binomial coefficient $\binom{n}{k}$ counts the
binary words of length n having k occurrences of a designated letter; see Figure 2. In order to
compose the bivariate GF, start from the simplest case of Newton’s binomial theorem and form
directly the horizontal GFs corresponding to a fixed n:

\[ W_n(u) := \sum_{k=0}^{n} \binom{n}{k} u^k = (1+u)^n. \tag{1} \]

Then a summation over all values of n gives the ordinary BGF

\[ W(z,u) = \sum_{k,n \ge 0} \binom{n}{k} u^k z^n = \sum_{n \ge 0} (1+u)^n z^n = \frac{1}{1 - z(1+u)}. \tag{2} \]


Such calculations are typical of BGF manipulations. What we have done amounts to starting
from a sequence of numbers, determining the horizontal GFs $W_n(u)$ in (1), then the bivariate
GF $W(z, u)$ in (2), according to the scheme:

\[ W_{n,k} \;\leadsto\; W_n(u) \;\leadsto\; W(z,u). \]

Observe that (2) reduces to the OGF $(1 - 2z)^{-1}$ of binary words, as it should, upon setting
$u = 1$.

FIGURE III.2. The set $\mathcal{W}_5$ of the 32 binary words over the alphabet $\{\square, \blacksquare\}$ enumerated
according to the number of occurrences of the letter ‘$\blacksquare$’ gives rise to the bivariate counting
sequence $\{W_{5,j}\} = 1, 5, 10, 10, 5, 1$.

In addition, one can deduce from (2) the vertical GFs of the binomial coefficients corre-
sponding to a fixed value of k,

\[ W^{\langle k \rangle}(z) = \sum_{n \ge 0} \binom{n}{k} z^n = \frac{z^k}{(1-z)^{k+1}}, \]

from an expansion of the BGF with respect to u,

\[ W(z,u) = \frac{1}{1-z}\, \frac{1}{1 - u\,\frac{z}{1-z}} = \sum_{k \ge 0} u^k\, \frac{z^k}{(1-z)^{k+1}}, \tag{3} \]

and the result naturally matches what a direct calculation would give. END OF EXAMPLE III.1.
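
The manipulations of this example can be replayed on truncated series. The sketch below is an illustration with hypothetical function names: it expands the ordinary BGF through the equivalent relation $W = 1 + z(1+u)W$ and expands the vertical GFs $z^k/(1-z)^{k+1}$ by repeated multiplication by $1/(1-z)$, which amounts to taking prefix sums of coefficients.

```python
from math import comb

def bgf_binomial(n_max):
    """Coefficients W[n][k] of 1/(1 - z(1+u)), obtained from W = 1 + z(1+u) W."""
    W = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    W[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(n + 1):
            # z W contributes W[n-1][k]; z u W contributes W[n-1][k-1]
            W[n][k] = W[n - 1][k] + (W[n - 1][k - 1] if k > 0 else 0)
    return W

def vertical_gf(k, n_max):
    """Coefficients of z^k / (1-z)^(k+1) up to z^n_max."""
    series = [1 if n == k else 0 for n in range(n_max + 1)]    # the monomial z^k
    for _ in range(k + 1):                                     # multiply by 1/(1-z)
        for n in range(1, n_max + 1):
            series[n] += series[n - 1]                         # running prefix sum
    return series

W = bgf_binomial(8)
print(W[5])                   # horizontal data for n = 5
print(vertical_gf(2, 8))      # vertical data for k = 2
```

Row n of `W` carries the coefficients of the horizontal GF $(1+u)^n$, and `vertical_gf(k, n_max)` carries column k, so the two views of the same array can be compared entry by entry.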


▷ III.1. The exponential BGF of binomial coefficients. It is

\[ \widetilde{W}(z,u) = \sum_{k,n} \binom{n}{k} u^k \frac{z^n}{n!} = \sum_{n} (1+u)^n \frac{z^n}{n!} = e^{z(1+u)}. \tag{4} \]

The vertical GFs are $e^z z^k / k!$. The horizontal GFs are $(1 + u)^n$, like in the ordinary case. ◁
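EXAMPLE III.2. The BGF of Stirling cycle numbers. As seen in Chapter II Example 12,
the number of permutations of size n having k cycles is the Stirling cycle number $\begin{bmatrix} n \\ k \end{bmatrix}$, with a
vertical EGF being

\[ P^{\langle k \rangle}(z) := \sum_{n} \begin{bmatrix} n \\ k \end{bmatrix} \frac{z^n}{n!} = \frac{L(z)^k}{k!}, \qquad\text{where}\quad L(z) := \log\frac{1}{1-z}. \]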


From there, the exponential BGF is formed as follows (this revisits some of the calculations on
p. 112):

\[ P(z,u) := \sum_{k} P^{\langle k \rangle}(z)\, u^k = \sum_{k} \frac{u^k}{k!} L(z)^k = e^{u L(z)} = (1-z)^{-u}. \tag{5} \]


The simplification is quite remarkable but altogether quite typical, as we shall see shortly, in the 
context of a labelled set construction. The starting point is thus a collection of vertical EGFs 
and the scheme is now 


\[ P^{\langle k \rangle}_n \;\leadsto\; P^{\langle k \rangle}(z) \;\leadsto\; P(z,u). \]


Observe that (5) reduces to the EGF of permutations at u = 1. 
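                     Binomial coefficients                       Stirling cycle numbers
    Numbers          $\binom{n}{k}$                              $\begin{bmatrix} n \\ k \end{bmatrix}$
    Horizontal GFs   $(1+u)^n$                                   $u(u+1)\cdots(u+n-1)$
    Vertical GFs     OGF: $\frac{z^k}{(1-z)^{k+1}}$              EGF: $\frac{1}{k!}\left(\log\frac{1}{1-z}\right)^k$
    BGF              Ordinary: $\frac{1}{1-z(1+u)}$              Exponential: $(1-z)^{-u}$

FIGURE III.3. The various GFs associated to binomial coefficients (left) and Stirling
cycle numbers (right).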


In addition, an expansion of the BGF according to the variable z provides useful infor-
mation, namely, the horizontal GFs, by virtue of Newton’s binomial theorem:

\[ P(z,u) = \sum_{n \ge 0} \binom{n+u-1}{n} z^n = \sum_{n \ge 0} P_n(u)\, \frac{z^n}{n!}, \qquad\text{where}\quad P_n(u) = u(u+1)\cdots(u+n-1). \tag{6} \]

This last polynomial is called the Stirling cycle polynomial of index n and it describes com-
pletely the distribution of the number of cycles in all permutations of size n. In addition, note
that the relation

\[ P_n(u) = P_{n-1}(u)\,(u + (n-1)) \]

is equivalent to a recurrence

\[ \begin{bmatrix} n \\ k \end{bmatrix} = (n-1) \begin{bmatrix} n-1 \\ k \end{bmatrix} + \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}, \]

by which Stirling numbers are often defined and easily evaluated numerically; see also
APPENDIX A: Stirling numbers, p. 680. (The recurrence is susceptible to a direct combinatorial
interpretation: add n either to an existing cycle or as a “new” singleton.) END OF EXAMPLE III.2.
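
The recurrence lends itself to direct evaluation. The sketch below, with hypothetical function names, builds the triangle of Stirling cycle numbers from the recurrence $\begin{bmatrix} n \\ k \end{bmatrix} = (n-1)\begin{bmatrix} n-1 \\ k \end{bmatrix} + \begin{bmatrix} n-1 \\ k-1 \end{bmatrix}$ and compares one row with a brute-force count of cycles over all permutations.

```python
from itertools import permutations

def stirling_cycle_table(n_max):
    """c[n][k] from c[n][k] = (n-1) c[n-1][k] + c[n-1][k-1], with c[0][0] = 1."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            c[n][k] = (n - 1) * c[n - 1][k] + c[n - 1][k - 1]
    return c

def cycle_count(p):
    """Number of cycles of a permutation given as the tuple of images of 0..n-1."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:          # walk the cycle through start
                seen.add(i)
                i = p[i]
    return cycles

def cycle_histogram(n):
    """hist[k] = number of permutations of size n with exactly k cycles."""
    hist = [0] * (n + 1)
    for p in permutations(range(n)):
        hist[cycle_count(p)] += 1
    return hist

table = stirling_cycle_table(6)
print(table[5][:6])
print(cycle_histogram(5))
```

Row n of the table lists the coefficients of the Stirling cycle polynomial $P_n(u)$, so its entries sum to $P_n(1) = n!$.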


Concise expressions for BGFs like (2), (3), (5), or (17) summarized in Figure 3 
are precious for deriving moments, variance, and even finer characteristics of distri- 
butions, as we see next. The determination of such BGFs can be covered by a simple 
extension of the symbolic method along the lines of what was done in Chapters I 
and II, as detailed in Sections III.3 and III.4.


III.2. Bivariate generating functions and probability distributions


Our purpose in this book is to analyse characteristics of combinatorial structures 
of very diverse types. We shall be principally interested in enumeration according to 
size and an auxiliary parameter, the corresponding problems being naturally treated 
by means of BGFs. In order to avoid redundant definitions, it proves convenient to 
introduce the sequence of fundamental factors $(\omega_n)_{n \ge 0}$, defined by

\[ \omega_n = 1 \ \text{ for ordinary GFs,} \qquad \omega_n = n! \ \text{ for exponential GFs.} \tag{7} \]

Then, the OGF and EGF of a sequence $(f_n)$ are jointly represented as

\[ f(z) = \sum_{n} f_n\, \frac{z^n}{\omega_n} \qquad\text{and}\qquad f_n = \omega_n\, [z^n] f(z). \]


Definition III.2. Given a combinatorial class $\mathcal{A}$, a (scalar) parameter is a function
from $\mathcal{A}$ to $\mathbb{Z}_{\ge 0}$ that associates to any object $\alpha \in \mathcal{A}$ an integer value $\chi(\alpha)$. The
sequence

\[ A_{n,k} = \operatorname{card}\bigl(\{\alpha \in \mathcal{A} \mid |\alpha| = n,\ \chi(\alpha) = k\}\bigr), \]

is called the counting sequence of the pair $\mathcal{A}, \chi$. The bivariate generating function
(BGF) of $\mathcal{A}, \chi$ is defined as

\[ A(z, u) := \sum_{n,k \ge 0} A_{n,k}\, \frac{z^n}{\omega_n}\, u^k, \]

and is of ordinary type if $\omega_n \equiv 1$ and of exponential type if $\omega_n \equiv n!$. One says that
the variable z marks size and the variable u marks the parameter $\chi$.

Naturally $A(z, 1)$ reduces to the usual counting generating function $A(z)$ associ-
ated to $\mathcal{A}$, and the cardinality of $\mathcal{A}_n$ is expressible as

\[ A_n = \omega_n\, [z^n] A(z, 1). \]


III.2.1. Distributions and moments. As indicated in the introduction to this
chapter, the eventual goal of multivariate enumeration is the quantification of prop- 
erties present with high regularity in large random structures. Within this section, 
we discuss the relationship between probabilistic models needed to interpret bivari- 
ate counting sequences and bivariate generating functions. The elementary notions 
needed are recalled in APPENDIX A: Combinatorial probability, p. 671. 


Consider a combinatorial class $\mathcal{A}$. The uniform probability distribution over $\mathcal{A}_n$
assigns to any $\alpha \in \mathcal{A}_n$ a probability equal to $1/A_n$. We shall use the symbol $\mathbb{P}$ to
denote probability and occasionally subscript it with an indication of the probabilistic
model used, whenever this model needs to be stressed: we shall then write $\mathbb{P}_{\mathcal{A}_n}$ (or
simply $\mathbb{P}_n$ if $\mathcal{A}$ is understood) to indicate probability relative to the uniform distribu-
tion over $\mathcal{A}_n$.


Probability generating functions. Consider a parameter $\chi$. It determines over
each $\mathcal{A}_n$ a discrete random variable defined over the discrete probability space $\mathcal{A}_n$:

(8)   $$\mathbb{P}_{\mathcal{A}_n}\{\chi = k\} = \frac{A_{n,k}}{A_n} = \frac{A_{n,k}}{\sum_k A_{n,k}}.$$

Given a discrete random variable X, we recall that its probability generating function
(PGF) is the quantity

(9)   $$p(u) = \sum_k \mathbb{P}(X = k)\, u^k,$$


a generating function whose coefficients are probabilities. From (8) and (9), one has 
immediately: 




FIGURE III.4. Histograms of two combinatorial distributions. Left: the number of 
occurrences of a designated letter in a random binary word of length 50 (binomial distri- 
bution). Right: the number of cycles in a random permutation of size 50 (Stirling cycle 
distribution). 


Proposition III.1 (PGFs from BGFs). Let $A(z,u)$ be the bivariate generating function
of a parameter $\chi$ defined over a combinatorial class $\mathcal{A}$. The probability generating
function of $\chi$ over $\mathcal{A}_n$ is given by

$$\sum_k \mathbb{P}_{\mathcal{A}_n}(\chi = k)\, u^k = \frac{[z^n] A(z,u)}{[z^n] A(z,1)},$$

and is thus a normalized version of a horizontal generating function.


The translation into the language of probability enables us to make use of which- 
ever intuition might be available in any particular case, while allowing for a nat- 
ural interpretation of data (Figure 4). Indeed, instead of noting that the quantity 
381922055502195 represents the number of permutations of size 20 that have 10 
cycles, it is perhaps more informative to state the probability of the event, which is 
0.00015, i.e., about 1.5 per ten thousand. Discrete distributions are conveniently
represented by histograms or “bar charts”, where the height of the bar at abscissa k
indicates the value of $\mathbb{P}\{X = k\}$. Figure 4 displays in this way two classical combinatorial
distributions. Given the uniform probabilistic model that we have been adopting, such 
histograms are eventually nothing but a condensed form of the “stacks” corresponding 
to exhaustive listings, like the one displayed in Figure 2. 
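As a small numerical illustration of Proposition III.1 for cycles in permutations, the following Python sketch computes the horizontal slice for n = 20 and normalizes it by 20!. It assumes the standard recurrence c(n,k) = c(n-1,k-1) + (n-1) c(n-1,k) for the Stirling cycle numbers, which is not derived in this section; the function name `stirling_cycle_row` is purely illustrative.

```python
from math import factorial

def stirling_cycle_row(n):
    """Return [c(n,0), ..., c(n,n)]: permutations of size n counted by number of cycles."""
    row = [1]  # n = 0: the empty permutation has 0 cycles
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(m + 1):
            from_new_cycle = row[k - 1] if k >= 1 else 0  # element m forms its own cycle
            from_insertion = row[k] if k < m else 0       # element m is inserted into a cycle
            new[k] = from_new_cycle + (m - 1) * from_insertion
        row = new
    return row

row20 = stirling_cycle_row(20)
# PGF coefficient: number of permutations with 10 cycles, divided by 20!
prob_10_cycles = row20[10] / factorial(20)
print(row20[10], prob_10_cycles)
```

The printed pair can be compared with the count and the probability quoted in the text.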


Moments. Important information is conveyed by moments. Given a discrete ran- 
dom variable X, the expectation of f(X) is by definition the linear functional 


$$\mathbb{E}(f(X)) := \sum_k \mathbb{P}\{X = k\} \cdot f(k).$$

The (power) moments are

$$\mathbb{E}(X^r) := \sum_k \mathbb{P}\{X = k\} \cdot k^r.$$

Then the expectation (or average, mean) of X, its variance, and its standard deviation
are expressed as

$$\mathbb{E}(X), \qquad \mathbb{V}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2, \qquad \sigma(X) = \sqrt{\mathbb{V}(X)}.$$

The expectation corresponds to what is typically seen when forming the arithmetic
mean value of a large number of observations: this property is the weak law of large 
numbers [161, Ch X]. The standard deviation then measures the dispersion of values 
observed from the expectation and it does so in a mean-quadratic sense. 

The factorial moment defined for order r as 


$$\mathbb{E}\bigl(X(X-1)\cdots(X-r+1)\bigr)$$

is also of interest for computational purposes, since it is obtained plainly by differen- 
tiation of PGFs (APPENDIX A: Combinatorial probability, p. 671). Power moments 
are then easily recovered as linear combinations of factorial moments, see Note 7 of 
Appendix A. In summary: 
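Proposition III.2 (Moments from BGFs). The factorial moment of order r of a
parameter $\chi$ is determined from the BGF $A(z,u)$ by r-fold differentiation followed by
specialization at 1:

$$\mathbb{E}_{\mathcal{A}_n}\bigl(\chi(\chi-1)\cdots(\chi-r+1)\bigr) = \frac{[z^n]\, \partial_u^r A(z,u)\big|_{u=1}}{[z^n] A(z,1)}.$$

In particular, the first two moments satisfy

$$\mathbb{E}_{\mathcal{A}_n}(\chi) = \frac{[z^n]\, \partial_u A(z,u)\big|_{u=1}}{[z^n] A(z,1)},$$

$$\mathbb{E}_{\mathcal{A}_n}(\chi^2) = \frac{[z^n]\, \partial_u^2 A(z,u)\big|_{u=1}}{[z^n] A(z,1)} + \frac{[z^n]\, \partial_u A(z,u)\big|_{u=1}}{[z^n] A(z,1)},$$

the variance and standard deviation being then determined by

$$\mathbb{V}(\chi) = \sigma(\chi)^2 = \mathbb{E}(\chi^2) - \mathbb{E}(\chi)^2.$$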

PROOF. The PGF $p_n(u)$ of $\chi$ over $\mathcal{A}_n$ is given by Proposition III.1. On the other hand,
factorial moments are on general grounds obtained from a PGF by differentiation and
specialization at u = 1 (APPENDIX A: Combinatorial probability, p. 671). The result
follows.

In other words, the quantities

$$\Omega_n^{(k)} := \omega_n \cdot \Bigl([z^n]\, \partial_u^k A(z,u)\big|_{u=1}\Bigr)$$

give, after a simple normalization (by $\omega_n \cdot [z^n] A(z,1)$), the factorial moments:

$$\mathbb{E}\bigl(\chi(\chi-1)\cdots(\chi-k+1)\bigr) = \frac{1}{A_n}\, \Omega_n^{(k)}.$$

Most notably, $\Omega_n^{(1)}$ is the cumulated value of $\chi$ over all objects of $\mathcal{A}_n$:

$$\Omega_n^{(1)} := \omega_n \cdot [z^n]\, \partial_u A(z,u)\big|_{u=1} \equiv \sum_{\alpha \in \mathcal{A}_n} \chi(\alpha) \equiv A_n \cdot \mathbb{E}_{\mathcal{A}_n}(\chi).$$

Accordingly, the GF (ordinary or exponential) of the $\Omega_n^{(1)}$ is sometimes named the
cumulative generating function. It can be viewed as an unnormalized generating function
of the sequence of expected values. These considerations explain Wilf’s suggestive 
motto quoted on p. 139: 

“Generating functions find averages, etc.” 


The “etc” is to be interpreted as a token for higher moments. 
▷ III.2. A combinatorial form of cumulative GFs. One has

$$\Omega^{(1)}(z) := \sum_n \mathbb{E}_{\mathcal{A}_n}(\chi)\, A_n \frac{z^n}{\omega_n} = \sum_{\alpha \in \mathcal{A}} \chi(\alpha) \frac{z^{|\alpha|}}{\omega_{|\alpha|}},$$

where $\omega_n = 1$ (ordinary case) or $\omega_n = n!$ (exponential case). ◁

EXAMPLE III.3. Moments of the binomial distribution. The binomial distribution of index n
can be defined as the distribution of the number of a's in a random word of length n over the
binary alphabet {a, b}. The determination of moments results easily from the ordinary BGF,

$$W(z,u) = \frac{1}{1 - z - zu}.$$

By differentiation, one finds

$$\left.\frac{\partial^r}{\partial u^r} W(z,u)\right|_{u=1} = \frac{r!\, z^r}{(1-2z)^{r+1}}.$$

Coefficient extraction then gives the form of the factorial moments of orders 1, 2, 3, ..., r as

$$\frac{n}{2}, \quad \frac{n(n-1)}{4}, \quad \frac{n(n-1)(n-2)}{8}, \quad \ldots, \quad \frac{r!}{2^r}\binom{n}{r}.$$

In particular, the mean and the variance are $\frac{1}{2}n$ and $\frac{1}{4}n$. The standard deviation is thus $\frac{1}{2}\sqrt{n}$,
which is of an order much smaller than the mean: this indicates that the distribution is somehow
concentrated around its mean value, as suggested by Figure 4; see the next subsection for
quantitative estimates. ........ END OF EXAMPLE III.3.
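The closed form of Example III.3 can be cross-checked against the definition of factorial moments. The sketch below is only an illustration (the function names are not from the text): it sums over the binomial probabilities $\binom{n}{k}/2^n$ with exact rational arithmetic and compares the result with $\frac{r!}{2^r}\binom{n}{r}$.

```python
from fractions import Fraction
from math import comb, factorial

def binomial_factorial_moment(n, r):
    """E[X(X-1)...(X-r+1)], X = number of a's in a uniform random word of length n over {a, b}."""
    total = Fraction(0)
    for k in range(n + 1):
        falling = 1
        for j in range(r):
            falling *= (k - j)  # falling factorial k(k-1)...(k-r+1)
        total += Fraction(comb(n, k), 2 ** n) * falling
    return total

def closed_form(n, r):
    """r!/2^r * binom(n, r), the value obtained by coefficient extraction in the example."""
    return Fraction(factorial(r) * comb(n, r), 2 ** r)

def binomial_variance(n):
    """Variance recovered from the first two factorial moments: E[X(X-1)] + E[X] - E[X]^2."""
    m1 = binomial_factorial_moment(n, 1)
    m2 = binomial_factorial_moment(n, 2)
    return m2 + m1 - m1 ** 2
```

The variance helper uses the relation between power and factorial moments recalled in Proposition III.2.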


▷ III.3. De Moivre’s approximation of the Gaussian coefficients. The fact that the mean and
the standard deviation of the binomial distribution are respectively $\frac{1}{2}n$ and $\frac{1}{2}\sqrt{n}$ suggests an
examination of what goes on at a distance of x standard deviations from the mean. Consider for
simplicity the case of $n = 2\nu$ even. From the ratio

$$r(\nu,\ell) := \frac{\binom{2\nu}{\nu+\ell}}{\binom{2\nu}{\nu}} = \frac{(1-\frac{1}{\nu})(1-\frac{2}{\nu})\cdots(1-\frac{\ell-1}{\nu})}{(1+\frac{1}{\nu})(1+\frac{2}{\nu})\cdots(1+\frac{\ell}{\nu})},$$

an estimate of the logarithm shows that for any fixed $x \in \mathbb{R}$,

$$\lim_{n\to\infty,\ \ell = \nu + x\sqrt{\nu/2}} \frac{\binom{2\nu}{\ell}}{\binom{2\nu}{\nu}} = e^{-x^2/2}.$$

(Alternatively, Stirling’s formula can be employed.) This Gaussian approximation for the
binomial distribution was first discovered in 1733 by Abraham de Moivre (1667-1754), a close
friend of Newton. Much more general methods for establishing such approximations form the
subject of Chapter IX. ◁
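The limit in Note III.3 can be observed numerically. The following sketch is an illustration only: the function name `central_ratio` and the choice $\nu = 5000$ (for which $\sqrt{\nu/2} = 50$ is an integer) are assumptions made here, not values from the text. It evaluates the ratio of binomial coefficients at x standard deviations from the centre.

```python
from math import comb, exp, sqrt

def central_ratio(nu, x):
    """binom(2*nu, nu + l) / binom(2*nu, nu), with l = x * sqrt(nu/2) rounded to an integer."""
    offset = round(x * sqrt(nu / 2))
    return comb(2 * nu, nu + offset) / comb(2 * nu, nu)

for x in (0.5, 1.0, 2.0):
    # second column: exact ratio; third column: the Gaussian value exp(-x^2/2)
    print(x, central_ratio(5000, x), exp(-x * x / 2))
```

Increasing `nu` should bring the two columns closer together, in line with the stated limit.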


EXAMPLE III.4. Moments of the Stirling cycle distribution. Let us return to the example of 
cycles in permutations which is of interest in connection with certain sorting algorithms like 
bubble sort or insertion sort, maximum finding, and in situ rearrangement [301]. 

We are dealing with labelled objects, hence exponential generating functions. As seen 
earlier on p. 143, the BGF of permutations counted according to cycles is 


$$P(z,u) = (1-z)^{-u}.$$

We have $P_n = n!$, while $\omega_n = n!$ since the BGF is exponential. (The number of permutations
of size n being n!, the combinatorial normalization happens to coincide with the factor of 1/n!
present in all exponential generating functions.)

By differentiation of the BGF with respect to u, then setting u = 1, we next get the
expected number of cycles in a random permutation of size n as a Taylor coefficient

(10)   $$\mathbb{E}_n(\chi) = [z^n]\, \frac{1}{1-z} \log\frac{1}{1-z} = 1 + \frac{1}{2} + \cdots + \frac{1}{n},$$

which is the harmonic number $H_n$. Thus, on average, a random permutation of size n has about
$\log n + \gamma$ cycles, a well known fact of discrete probability theory, derived on p. 112 by means
of horizontal generating functions.

For the variance, a further differentiation of the bivariate EGF gives

(11)   $$\sum_{n\ge 0} \mathbb{E}_n(\chi(\chi-1))\, z^n = \frac{1}{1-z} \left(\log\frac{1}{1-z}\right)^2.$$

From this expression and Note 4 (or directly from the Stirling polynomials), a calculation shows
that

(12)   $$\sigma_n^2 = \left(\sum_{k=1}^n \frac{1}{k}\right) - \left(\sum_{k=1}^n \frac{1}{k^2}\right) = \log n + \gamma - \frac{\pi^2}{6} + O\!\left(\frac{1}{n}\right).$$

Thus, asymptotically,

$$\sigma_n \sim \sqrt{\log n}.$$

The standard deviation is of an order smaller than the mean, and therefore deviations from the
mean have an asymptotically negligible probability of occurrence (see below the discussion of
moment inequalities). Furthermore, the distribution was proved to be asymptotically Gaussian
by V. Gončarov, around 1942, see [240] and Chapter IX. ....... END OF EXAMPLE III.4.
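Formulas (10) and (12) can be verified exactly for small n. The sketch below is illustrative: it assumes the standard recurrence for Stirling cycle numbers (not derived in this section) and uses function names chosen here. It computes the mean and variance of the number of cycles from the exact distribution and compares them with $H_n$ and $H_n - H_n^{(2)}$.

```python
from fractions import Fraction
from math import factorial

def stirling_cycle_row(n):
    """[c(n,0), ..., c(n,n)] via c(m,k) = c(m-1,k-1) + (m-1) c(m-1,k)."""
    row = [1]
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(m + 1):
            a = row[k - 1] if k >= 1 else 0
            b = row[k] if k < m else 0
            new[k] = a + (m - 1) * b
        row = new
    return row

def cycle_moments(n):
    """Exact (mean, variance) of the number of cycles in a uniform random permutation of size n."""
    row, total = stirling_cycle_row(n), factorial(n)
    mean = sum(Fraction(k * c, total) for k, c in enumerate(row))
    second = sum(Fraction(k * k * c, total) for k, c in enumerate(row))
    return mean, second - mean ** 2

def harmonic(n, r=1):
    """Generalized harmonic number: sum of j^(-r) for j = 1..n."""
    return sum(Fraction(1, j ** r) for j in range(1, n + 1))
```

For example, `cycle_moments(n)` can be compared with `(harmonic(n), harmonic(n) - harmonic(n, 2))` for each n.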


▷ III.4. Stirling cycle numbers and harmonic numbers. By the “exp-log trick” of Chapter I,
the PGF of the Stirling cycle distribution satisfies

$$\frac{1}{n!}\, u(u+1)\cdots(u+n-1) = \exp\left(v H_n - \frac{v^2}{2} H_n^{(2)} + \frac{v^3}{3} H_n^{(3)} - \cdots\right), \qquad u = 1 + v,$$

where $H_n^{(r)}$ is the generalized harmonic number $\sum_{j=1}^n j^{-r}$. Consequently, any moment of
the distribution is a polynomial in generalized harmonic numbers, cf (10) and (12). Also, the
kth moment satisfies $\mathbb{E}_{\mathcal{P}_n}(\chi^k) \sim (\log n)^k$. (The same technique expresses the Stirling cycle
number $\left[{n \atop k}\right]$ as a polynomial in generalized harmonic numbers $H_{n-1}^{(r)}$.)

Alternatively, start from the expansion of $(1-z)^{-\alpha}$ and differentiate repeatedly with
respect to $\alpha$; for instance, one has

$$(1-z)^{-\alpha} \log\frac{1}{1-z} = \sum_{n\ge 0} \left(\frac{1}{\alpha} + \frac{1}{\alpha+1} + \cdots + \frac{1}{n-1+\alpha}\right) \binom{n+\alpha-1}{n} z^n,$$

which provides (10) upon setting $\alpha = 1$, while the next differentiation gives access to (12). ◁


The situation encountered with cycles in permutations is typical of iterative
(non-recursive) structures. In many other cases, especially when dealing with recursive
structures, the bivariate GF may satisfy complicated functional equations in two
variables (see the example of path length in trees, Section III. 5 below) that do not make
them available under an explicit form. Thus, exact expressions for the distributions
are not always available, but asymptotic laws can be determined in a large number of
cases (Chapter IX). In all cases, the BGFs are the central tool in obtaining mean and
variance estimates, since their derivatives instantiated at u = 1 become univariate GFs
that usually satisfy much simpler relations than the BGFs themselves.
III. 2.2. Moment inequalities and concentration of distributions. Qualitatively
speaking, families of distributions can be classified into two categories: (i) distributions
that are spread, i.e., the standard deviation is of order at least as large as the
mean (e.g. the uniform distributions over [0 .. n], which have totally flat histograms,
are spread); (ii) distributions such that the standard deviation is of an order smaller
than the mean. Figure 4 illustrates the phenomena at stake and suggests that both
the Stirling cycle distributions and the binomial distributions belong to the second
category and are somehow concentrated around their mean value. Such informal
observations are indeed supported by the Markov-Chebyshev inequalities, which take
advantage of information provided by the first two moments. (A proof is found in
APPENDIX A: Combinatorial probability, p. 671.)


Markov-Chebyshev inequalities. Let X be a nonnegative random variable
and Y an arbitrary real variable. One has for any t > 0:

$$\mathbb{P}\{X \ge t\,\mathbb{E}(X)\} \le \frac{1}{t} \qquad \text{(Markov inequality)}$$

$$\mathbb{P}\{|Y - \mathbb{E}(Y)| \ge t\,\sigma(Y)\} \le \frac{1}{t^2} \qquad \text{(Chebyshev inequality).}$$

This result informs us that the probability of being much larger than the mean must
decay (Markov) and that an upper bound on the decay is measured in units given by
the standard deviation (Chebyshev).

The next proposition formalizes a concentration property of distributions. It ap- 
plies to a family of distributions indexed by the integers. 
Proposition III.3 (Concentration of distribution). Consider a family of random variables
$X_n$, typically, a scalar parameter $\chi$ on the subclass $\mathcal{A}_n$. Assume that the means
$\mu_n = \mathbb{E}(X_n)$ and the standard deviations $\sigma_n = \sigma(X_n)$ satisfy the condition

$$\lim_{n\to+\infty} \frac{\sigma_n}{\mu_n} = 0.$$

Then the distribution of $X_n$ is concentrated in the sense that, for any $\epsilon > 0$, there
holds

(13)   $$\lim_{n\to+\infty} \mathbb{P}\left\{1 - \epsilon \le \frac{X_n}{\mu_n} \le 1 + \epsilon\right\} = 1.$$

PROOF. It is a direct consequence of Chebyshev’s inequality.

The concentration property (13) expresses the fact that values of $X_n$ tend to become
closer and closer (in relative terms) to the mean $\mu_n$ as n increases. Another figurative
way to describe concentration, much used in random combinatorics, is by saying
that “$X_n/\mu_n$ tends to 1 in probability”. When this property is satisfied, the expected
value is in a strong sense a typical value. This fact is an extension of the weak law of
large numbers of probability theory. In that field, the concentration property (13) is
also known as convergence in probability and is then written more concisely:

$$\frac{X_n}{\mu_n} \xrightarrow{\ P\ } 1.$$

FIGURE III.5. Plots of the binomial distributions for n = 5, ..., 50. The horizontal
axis is normalized (by a factor of 1/n) and rescaled to 1, so that the curves display
$\{\mathbb{P}(X_n/n = x)\}$, for $x = 0, \frac{1}{n}, \frac{2}{n}, \ldots$.



Concentration properties of the binomial and Stirling cycle distributions. The
binomial distribution is concentrated, since the mean of the distribution is n/2 and
the standard deviation is $\sqrt{n}/2$, a much smaller quantity. Figure 5 illustrates
concentration by displaying the graphs (as polygonal lines) associated to the binomial
distributions for n = 5, ..., 50. Concentration is also quite perceptible on simulations
as n gets large: the table below describes the results of batches of ten (sorted)
simulations from the binomial distribution $\bigl\{2^{-n}\binom{n}{k}\bigr\}_{k=0}^{n}$:

    n = 100       39, 42, 43, 49, 50, 52, 54, 55, 55, 57
    n = 1000      487, 492, 494, 494, 506, 508, 512, 516, 527, 545
    n = 10,000    4972, 4988, 5000, 5004, 5012, 5017, 5023, 5025, 5034, 5065
    n = 100,000   49798, 49873, 49968, 49980, 49999, 50017, 50029, 50080, 50101, 50284;

the maximal deviations from the mean observed on such samples are 22% ($n = 10^2$),
9% ($n = 10^3$), 1.3% ($n = 10^4$), and 0.6% ($n = 10^5$).
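Rather than simulating, one can also compute the relevant tail probabilities of the binomial distribution exactly. The sketch below is an illustration under the assumption that $X$ is binomial with parameters $(n, 1/2)$, as in the text; the function names are chosen here. It checks Chebyshev's bound and shows how the probability of a fixed relative deviation from the mean shrinks as n grows.

```python
from fractions import Fraction
from math import comb

def chebyshev_tail(n, t):
    """Exact P{|X - n/2| >= t * sigma} with sigma = sqrt(n)/2, for X binomial(n, 1/2)."""
    # |k - n/2| >= t*sqrt(n)/2  is equivalent to  (2k - n)^2 >= t^2 * n
    hits = sum(comb(n, k) for k in range(n + 1) if (2 * k - n) ** 2 >= t * t * n)
    return Fraction(hits, 2 ** n)

def relative_deviation_probability(n, eps):
    """Exact P{|X/mu - 1| > eps} with mu = n/2; eps should be an int or a Fraction."""
    hits = sum(comb(n, k) for k in range(n + 1) if abs(2 * k - n) > eps * n)
    return Fraction(hits, 2 ** n)

for n in (100, 1000):
    # probability of deviating from the mean by more than 10% in relative terms
    print(n, float(relative_deviation_probability(n, Fraction(1, 10))))
```

The integer reformulation of the inequalities avoids floating-point comparisons at the boundary.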

Similarly, the mean and variance computations of (10) and (12) imply that the 
number of cycles in a random permutation of large size is concentrated. 


Finer estimates on distributions form the subject of our Chapter IX dedicated 
to limit laws. The reader may get a feeling of some of the phenomena at stake 
when re-examining Figure 5: the visible emergence of a continuous curve (the bell 
shaped curve) corresponds to a common asymptotic shape for the whole family of 
distributions—the Gaussian law. 


III. 3. Inherited parameters and ordinary MGFs


We have seen so far basic manipulations of BGFs (Section III. 1) as well as their
use in order to determine moments of combinatorial distributions (Section III. 2). In
this section and its labelled counterpart, Section III. 4, we address the question of
determining directly BGFs from combinatorial specifications. The answer is provided
by a simple extension of the symbolic method, which is formulated in terms of
multivariate generating functions (MGFs). Such generating functions have the capability of
taking into account a finite collection (equivalently, a vector) of combinatorial
parameters. On the one hand, the theory specializes immediately to BGFs, which correspond
to the particular case of a single (scalar) parameter. On the other hand, it provides 
“complete” (multivariate) generating functions discussed in Section III. 6. 


III. 3.1. Multivariate generating functions (MGFs). The theory is best developed
in full generality for the joint analysis of a fixed finite collection of parameters.


Definition III.3. Consider a combinatorial class $\mathcal{A}$. A (multidimensional) parameter
$\chi = (\chi_1, \ldots, \chi_d)$ on the class is a function from $\mathcal{A}$ to the set $\mathbb{Z}_{\ge 0}^d$ of d-tuples of natural
numbers. The counting sequence of $\mathcal{A}$ with respect to size and the parameter $\chi$ is then
defined by

$$A_{n,k_1,\ldots,k_d} = \operatorname{card}\bigl\{\alpha \mid |\alpha| = n,\ \chi_1(\alpha) = k_1, \ldots, \chi_d(\alpha) = k_d\bigr\}.$$


We sometimes refer to such a parameter as a “multiparameter” when d > 1, and
a “simple” or “scalar” parameter otherwise. For instance, one may take the class $\mathcal{P}$
of all permutations $\sigma$, and for $\chi_j$ (j = 1, 2, 3) the number of cycles of length j in $\sigma$.
Alternatively, we may consider the class $\mathcal{W}$ of all words w over an alphabet with four
letters, $\{a_1, \ldots, a_4\}$, and take for $\chi_j$ (j = 1, ..., 4) the number of occurrences of the
letter $a_j$ in w, and so on.

The multi-index convention employed in various branches of mathematics greatly
simplifies notations: let $\mathbf{u} = (u_1, \ldots, u_d)$ be a vector of d formal variables and
$\mathbf{k} = (k_1, \ldots, k_d)$ be a vector of integers of the same dimension; then, the multi-power $\mathbf{u}^{\mathbf{k}}$
is defined as the monomial

(14)   $$\mathbf{u}^{\mathbf{k}} := u_1^{k_1} u_2^{k_2} \cdots u_d^{k_d}.$$


With this notation, we have: 


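Definition III.4. Let $A_{n,\mathbf{k}}$ be a multi-index sequence of numbers, where $\mathbf{k} \in \mathbb{N}^d$.
The multivariate generating function (MGF) of the sequence of either ordinary or
exponential type is defined as the formal power series

(15)   $$A(z,\mathbf{u}) = \sum_{n,\mathbf{k}} A_{n,\mathbf{k}}\, \mathbf{u}^{\mathbf{k}} z^n \quad \text{(ordinary MGF)}, \qquad A(z,\mathbf{u}) = \sum_{n,\mathbf{k}} A_{n,\mathbf{k}}\, \mathbf{u}^{\mathbf{k}} \frac{z^n}{n!} \quad \text{(exponential MGF).}$$

Given a class $\mathcal{A}$ and a parameter $\chi$, the multivariate generating function (MGF)
of the pair $(\mathcal{A}, \chi)$ is the MGF of the corresponding counting sequence. In particular,
one has the combinatorial forms

(16)   $$A(z,\mathbf{u}) = \sum_{\alpha\in\mathcal{A}} \mathbf{u}^{\chi(\alpha)} z^{|\alpha|} \quad \text{(ordinary MGF; unlabelled case)}, \qquad A(z,\mathbf{u}) = \sum_{\alpha\in\mathcal{A}} \mathbf{u}^{\chi(\alpha)} \frac{z^{|\alpha|}}{|\alpha|!} \quad \text{(exponential MGF; labelled case).}$$

One also says that $A(z,\mathbf{u})$ is the MGF of the combinatorial class with the formal
variable $u_j$ marking the parameter $\chi_j$ and z marking size.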
From the very definition, $A(z, \mathbf{1})$ (with $\mathbf{1}$ a vector of all 1’s) coincides with the
counting generating function of $\mathcal{A}$, either ordinary or exponential as the case may be.
One can then view an MGF as a deformation of a univariate GF by way of a (vector)
parameter $\mathbf{u}$, with the property for the multivariate GF to reduce to the univariate
counting GF at $\mathbf{u} = \mathbf{1}$. If all but one of the $u_j$ are set to 1, then a BGF results.
Thus, the symbolic calculus that we are going to develop opens full access to BGFs
and hence moments. In fact, it has the capacity of determining the joint probability
distribution of a finite collection of parameters.


▷ III.5. Specializations of MGFs. The exponential MGF of permutations with $u_1, u_2$ marking
the number of 1-cycles and 2-cycles respectively turns out to be

(17)   $$P(z, u_1, u_2) = \frac{\exp\bigl((u_1 - 1)z + (u_2 - 1)\frac{z^2}{2}\bigr)}{1 - z}.$$

(This is to be proved later in this chapter, p. 175.) The formula is checked to be consistent with
three already known specializations derived in Chapter II: (i) setting $u_1 = u_2 = 1$ gives back
the counting of all permutations, $P(z,1,1) = (1-z)^{-1}$, as it should; (ii) setting $u_1 = 0$ and
$u_2 = 1$ gives back the EGF of derangements, namely $e^{-z}/(1-z)$; (iii) setting $u_1 = u_2 = 0$
gives back the EGF of permutations with cycles all of length greater than 2, $P(z,0,0) =
e^{-z-z^2/2}/(1-z)$, a generalized derangement GF. In addition, the specialized BGF

$$P(z, u, 1) = \frac{e^{(u-1)z}}{1-z}$$

enumerates permutations according to singleton cycles. This last BGF interpolates between the
EGF of derangements (u = 0) and the EGF of all permutations (u = 1). ◁
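Formula (17) can be checked against brute-force enumeration for small sizes. The sketch below is illustrative (function names are chosen here): for numeric values of $u_1, u_2$ it compares $\sum_\sigma u_1^{\chi_1(\sigma)} u_2^{\chi_2(\sigma)}$ over all permutations of size n with $n!\,[z^n] P(z,u_1,u_2)$. The exponential factor is expanded through the differential equation $f' = (a + bz) f$ satisfied by $f = \exp(az + bz^2/2)$, a standard device that is an assumption of this sketch rather than something taken from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple mapping i -> perm[i]."""
    seen, lengths = [False] * len(perm), []
    for start in range(len(perm)):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths

def brute_force(n, u1, u2):
    """Sum of u1^(#1-cycles) * u2^(#2-cycles) over all permutations of size n."""
    total = Fraction(0)
    for perm in permutations(range(n)):
        ls = cycle_lengths(perm)
        total += Fraction(u1) ** ls.count(1) * Fraction(u2) ** ls.count(2)
    return total

def from_mgf(n, u1, u2):
    """n! [z^n] exp((u1-1) z + (u2-1) z^2/2) / (1 - z)."""
    a, b = Fraction(u1) - 1, Fraction(u2) - 1
    f = [Fraction(1)]  # Taylor coefficients of exp(a z + b z^2 / 2)
    for m in range(1, n + 1):
        prev2 = f[m - 2] if m >= 2 else 0
        f.append((a * f[m - 1] + b * prev2) / m)  # m f_m = a f_{m-1} + b f_{m-2}
    return factorial(n) * sum(f)  # division by (1 - z) amounts to taking partial sums
```

Calling both functions with $(u_1,u_2) = (0,1)$ gives the derangement numbers, in agreement with specialization (ii).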


III. 3.2. Inheritance and MGFs. Parameters that are inherited from substructures
can be taken into account by a direct extension of the symbolic method. With
a suitable use of the multi-index conventions, it is even the case that the translation 
rules previously established in Chapters I and II can be copied verbatim. This ap- 
proach opens the way to a large quantity of multivariate enumeration results that then 
follow automatically by the symbolic method. 


Let us consider a pair $(\mathcal{A}, \chi)$, where $\mathcal{A}$ is a combinatorial class endowed with its
usual size function $|\cdot|$ and $\chi = (\chi_1, \ldots, \chi_d)$ is a d-dimensional (multi)parameter.
Write $\chi_0$ for size and $z_0$ for the variable marking size (previously denoted by z).
The key point for theoretical developments is to define an extended multiparameter
$\chi = (\chi_0, \chi_1, \ldots, \chi_d)$, that is, we treat size and parameters on an equal basis. Then
the ordinary MGF in (15) assumes an extremely simple and symmetrical form:

(18)   $$A(\mathbf{z}) = \sum_{\mathbf{k}} A_{\mathbf{k}}\, \mathbf{z}^{\mathbf{k}} = \sum_{\alpha\in\mathcal{A}} \mathbf{z}^{\chi(\alpha)}.$$

There, the indeterminates are the vector $\mathbf{z} = (z_0, z_1, \ldots, z_d)$, the indices are $\mathbf{k} =
(k_0, k_1, \ldots, k_d)$ (where $k_0$ indexes size, previously denoted by n), and the usual
multi-index convention introduced in (14) is in force,

(19)   $$\mathbf{z}^{\mathbf{k}} := z_0^{k_0} z_1^{k_1} \cdots z_d^{k_d},$$

but it is now applied to (d + 1)-dimensional vectors.
Next, we define inherited parameters.


Definition III.5. Let $(\mathcal{A}, \chi)$, $(\mathcal{B}, \xi)$, $(\mathcal{C}, \zeta)$ be three combinatorial classes endowed
with parameters of the same dimension d. The parameter $\chi$ is said to be inherited in
the following cases:

• Disjoint union: when $\mathcal{A} = \mathcal{B} + \mathcal{C}$, the parameter $\chi$ is inherited from $\xi, \zeta$ iff
  its value is determined by cases from $\xi, \zeta$:

  $$\chi(\omega) = \begin{cases} \xi(\omega) & \text{if } \omega \in \mathcal{B} \\ \zeta(\omega) & \text{if } \omega \in \mathcal{C}. \end{cases}$$

• Cartesian product: when $\mathcal{A} = \mathcal{B} \times \mathcal{C}$, the parameter $\chi$ is inherited from $\xi, \zeta$
  iff its value is obtained additively from the values of $\xi, \zeta$:

  $$\chi(\langle\beta, \gamma\rangle) = \xi(\beta) + \zeta(\gamma).$$

• Composite constructions: when $\mathcal{A} = \mathfrak{K}\{\mathcal{B}\}$, where $\mathfrak{K}$ is a metasymbol
  representing any of SEQ, MSET, PSET, CYC, the parameter $\chi$ is inherited
  from $\xi$ iff its value is obtained additively from the values of $\xi$ on components;
  for instance, for sequences:

  $$\chi([\beta_1, \ldots, \beta_r]) = \xi(\beta_1) + \cdots + \xi(\beta_r).$$

With a natural extension of the notation used for constructions, one shall write

$$(\mathcal{A}, \chi) = (\mathcal{B}, \xi) + (\mathcal{C}, \zeta), \qquad (\mathcal{A}, \chi) = (\mathcal{B}, \xi) \times (\mathcal{C}, \zeta), \qquad (\mathcal{A}, \chi) = \mathfrak{K}\{(\mathcal{B}, \xi)\}.$$

This definition of inheritance is seen to be a natural extension of the axioms that
size itself has to satisfy (Chapter I): size of a disjoint union is defined by cases, while
size of a pair, and similarly of a composite construction, is obtained by addition.



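Theorem III.1 (Inherited parameters and ordinary MGFs). Let $\mathcal{A}$ be a combinatorial
class constructed from $\mathcal{B}, \mathcal{C}$, and let $\chi$ be a parameter inherited from $\xi$ defined on
$\mathcal{B}$ and (as the case may be) from $\zeta$ on $\mathcal{C}$. Then the translation rules of admissible
constructions stated in Theorem I.1 apply provided the multi-index convention (18) is
used. The associated operators on ordinary MGFs are then:

    Union:     $\mathcal{A} = \mathcal{B} + \mathcal{C}$            $\Longrightarrow$   $A(\mathbf{z}) = B(\mathbf{z}) + C(\mathbf{z})$
    Product:   $\mathcal{A} = \mathcal{B} \times \mathcal{C}$       $\Longrightarrow$   $A(\mathbf{z}) = B(\mathbf{z}) \cdot C(\mathbf{z})$
    Sequence:  $\mathcal{A} = \mathrm{SEQ}(\mathcal{B})$    $\Longrightarrow$   $A(\mathbf{z}) = \dfrac{1}{1 - B(\mathbf{z})}$
    Cycle:     $\mathcal{A} = \mathrm{CYC}(\mathcal{B})$    $\Longrightarrow$   $A(\mathbf{z}) = \sum_{\ell\ge 1} \dfrac{\varphi(\ell)}{\ell} \log\dfrac{1}{1 - B(\mathbf{z}^\ell)}$
    Multiset:  $\mathcal{A} = \mathrm{MSET}(\mathcal{B})$   $\Longrightarrow$   $A(\mathbf{z}) = \exp\Bigl(\sum_{\ell\ge 1} \dfrac{1}{\ell} B(\mathbf{z}^\ell)\Bigr)$
    Powerset:  $\mathcal{A} = \mathrm{PSET}(\mathcal{B})$   $\Longrightarrow$   $A(\mathbf{z}) = \exp\Bigl(\sum_{\ell\ge 1} \dfrac{(-1)^{\ell-1}}{\ell} B(\mathbf{z}^\ell)\Bigr)$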
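PROOF. The verification for sums and products is immediate, given the combinatorial
forms of OGFs. For disjoint unions, one has

$$A(\mathbf{z}) = \sum_{\alpha\in\mathcal{A}} \mathbf{z}^{\chi(\alpha)} = \sum_{\beta\in\mathcal{B}} \mathbf{z}^{\xi(\beta)} + \sum_{\gamma\in\mathcal{C}} \mathbf{z}^{\zeta(\gamma)},$$

as results from the fact that inheritance is defined by cases on unions. For cartesian
products, one has

$$A(\mathbf{z}) = \sum_{\alpha\in\mathcal{A}} \mathbf{z}^{\chi(\alpha)} = \sum_{\beta\in\mathcal{B}} \mathbf{z}^{\xi(\beta)} \times \sum_{\gamma\in\mathcal{C}} \mathbf{z}^{\zeta(\gamma)},$$

as results from the fact that inheritance is defined additively on products.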

The translation of composite constructions in the case of sequences, powersets, 
and multisets are then built up from the union and product schemes, in exactly the 
same manner as in the proof of Theorem I.1. Cycles are dealt with by the methods of 
APPENDIX A: Cycle construction, p. 674. 

This theorem is a straightforward extension of the symbolic method, but it is im- 
portant because it can be applied in a wide range of combinatorial applications. The 
reader is especially encouraged to study carefully the treatment of integer composi- 
tions below, as it illustrates in its bare bones version the power of the symbolic method 
for taking into account combinatorial parameters. 

The multi-index notation is a crucial ingredient for developing the general theory 
of multivariate enumerations. However, in most cases, we work with only a small 
number of parameters, typically one or two. In such cases, we often use vectors of 
variables like (z, u) or (z, u, v), the corresponding monomials being then written as
$z^n u^k$ or $z^n u^k v^\ell$. This has the advantage of avoiding unnecessary subscripts.


Integer compositions and marks. The class $\mathcal{C}$ of all integer compositions
(Chapter I) is specified by

$$\mathcal{C} = \mathrm{SEQ}(\mathcal{I}), \qquad \mathcal{I} = \mathrm{SEQ}_{\ge 1}(\mathcal{Z}),$$

where $\mathcal{I}$ is the set of all positive integers. The corresponding OGFs are

$$C(z) = \frac{1}{1 - I(z)}, \qquad I(z) = \frac{z}{1-z},$$

so that $C_n = 2^{n-1}$ (n ≥ 1). Say we want to enumerate compositions according
to the number $\chi$ of summands. One way to proceed, in accordance with the formal
definition of inheritance, is as follows. Let $\xi$ be the parameter that takes the constant
value 1 on all elements of $\mathcal{I}$. The parameter $\chi$ on compositions is inherited from the
(almost trivial) parameter $\xi \equiv 1$ defined on summands. The ordinary MGF of $(\mathcal{I}, \xi)$
is obviously

$$I(z,u) = zu + z^2 u + z^3 u + \cdots = \frac{zu}{1-z}.$$

Let $C(z,u)$ be the BGF of $(\mathcal{C}, \chi)$. By Theorem III.1, the schemes translating admissible
constructions in the univariate case carry over to the multivariate case, so that

(20)   $$C(z,u) = \frac{1}{1 - I(z,u)} = \frac{1}{1 - u\frac{z}{1-z}} = \frac{1-z}{1 - z(u+1)}.$$

Et voilà!
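A brute-force check of (20) is easy for small n. The sketch below is only an illustration (the generator `compositions` and the helper `parts_histogram` are names chosen here): it lists all compositions of n, tallies them by number of summands, and compares with the coefficient $[z^n u^k]\,C(z,u) = \binom{n-1}{k-1}$ that results from expanding (20).

```python
from math import comb

def compositions(n):
    """Yield every composition of n (ordered sequence of positive summands) as a tuple."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def parts_histogram(n):
    """Map k -> number of compositions of n having exactly k summands."""
    counts = {}
    for c in compositions(n):
        counts[len(c)] = counts.get(len(c), 0) + 1
    return counts

for n in range(1, 6):
    print(n, parts_histogram(n))
```

Each printed row is a horizontal slice of the BGF, in the sense of Section III. 1.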
Here is an alternative way of arriving at (20), which is important and is of much
use in the sequel. One may regard the enumeration of compositions with respect to
the number of summands as the enumeration of compositions with respect to both
size (i.e., number of atoms) and number of marks, where each summand carries a
mark, say ‘$\mu$’, which is an object of size 0. The number of marks is clearly inherited
from summands to compositions. Then, one has an enriched specification, and its
translation into MGFs,

(21)   $$\mathcal{C} = \mathrm{SEQ}\bigl(\mu\, \mathrm{SEQ}_{\ge 1}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z,u) = \frac{1}{1 - uz(1-z)^{-1}},$$

as granted by Theorem III.1 and based on the correspondence: $\mathcal{Z} \mapsto z$, $\mu \mapsto u$.
This notion of mark when used in conjunction with Theorem III.1 provides access to
many joint parameters, as shown in Example 5 below.


EXAMPLE III.5. Summands in integer compositions. Consider the double parameter $\chi =
(\chi_1, \chi_2)$ where $\chi_1$ is the number of parts equal to 1 and $\chi_2$ the number of parts equal to 2. One
can write down an extended specification, with $\mu_1$ a combinatorial mark for summands equal
to 1 and $\mu_2$ for summands equal to 2,

(22)   $$\mathcal{C} = \mathrm{SEQ}\bigl(\mu_1 \mathcal{Z} + \mu_2 \mathcal{Z}^2 + \mathrm{SEQ}_{\ge 3}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z, u_1, u_2) = \frac{1}{1 - (u_1 z + u_2 z^2 + z^3(1-z)^{-1})},$$

where $u_j$ (j = 1, 2) records the number of marks of type $\mu_j$.
Similarly, let $\mu$ mark each summand and $\mu_1$ mark summands equal to 1. Then, one has,

(23)   $$\mathcal{C} = \mathrm{SEQ}\bigl(\mu\mu_1 \mathcal{Z} + \mu\, \mathrm{SEQ}_{\ge 2}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z, u_1, u) = \frac{1}{1 - (u u_1 z + u z^2(1-z)^{-1})},$$

where u keeps track of the total number of summands and $u_1$ records the number of summands
equal to 1.

MGFs obtained in this way via the multivariate extension of the symbolic method can then
provide explicit counts, after suitable series expansions. For instance, the number of
compositions of n with k parts is, by (20),

$$[z^n u^k]\, \frac{1-z}{1 - (1+u)z} = \binom{n}{k} - \binom{n-1}{k} = \binom{n-1}{k-1},$$

a result otherwise obtained in Chapter I by direct combinatorial reasoning (the balls-and-bars
model). The number of compositions of n containing k parts equal to 1 is obtained from the
special case $u_2 = 1$ in (22),

$$[z^n u^k]\, \frac{1}{1 - uz - \frac{z^2}{1-z}} = [z^n]\, \frac{z^k (1-z)^{k+1}}{(1 - z - z^2)^{k+1}},$$

where the last OGF closely resembles a power of the OGF of Fibonacci numbers.

Following the discussion of Section III.2, such MGFs also carry complete information on
moments. In particular, the cumulated value of the number of parts in all compositions of n has
OGF

$$\partial_u C(z,u)\big|_{u=1} = \frac{z(1-z)}{(1-2z)^2},$$
FIGURE III.6. A random composition of n = 100 represented as a ragged landscape
(top); its associated profile $1^{20} 2^{12} 3^{10} 4^{1} 5^{1} 7^{1} 10^{1}$, defined as the partition obtained by
sorting the summands (bottom).


as seen from Section III. 2.1, since cumulated values are obtained via differentiation of a BGF.
Therefore, the expected number of parts in a random composition of n is exactly (n ≥ 1)

$$\frac{1}{2^{n-1}}\, [z^n]\, \frac{z(1-z)}{(1-2z)^2} = \frac{1}{2}(n+1).$$

A further differentiation will give access to the variance. The standard deviation is found to
be $\frac{1}{2}\sqrt{n-1}$, which is of an order (much) smaller than the mean. Thus, the distribution of the
number of summands in a random composition satisfies the concentration property as $n \to \infty$.
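The coefficient extraction behind the mean and the standard deviation can be mechanized with a few lines of exact series arithmetic. The sketch below is illustrative: the helper `series_quotient` is a name chosen here, and the second-derivative form $\partial_u^2 C(z,u)|_{u=1} = 2z^2(1-z)/(1-2z)^3$ is worked out from (20) for this example rather than quoted from the text.

```python
from fractions import Fraction

def series_quotient(num, den, order):
    """Taylor coefficients 0..order of num(z)/den(z), polynomials given as coefficient lists."""
    out = []
    for n in range(order + 1):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * out[n - j]
        out.append(acc / den[0])  # den[0] must be nonzero
    return out

ORDER = 20
counts = series_quotient([1, -1], [1, -2], ORDER)               # (1-z)/(1-2z)
first = series_quotient([0, 1, -1], [1, -4, 4], ORDER)          # z(1-z)/(1-2z)^2
second = series_quotient([0, 0, 2, -2], [1, -6, 12, -8], ORDER)  # 2z^2(1-z)/(1-2z)^3

def mean_and_variance(n):
    """Mean and variance of the number of summands in a random composition of n >= 1."""
    m1 = first[n] / counts[n]   # E(chi)
    m2 = second[n] / counts[n]  # E(chi (chi - 1))
    return m1, m2 + m1 - m1 ** 2
```

The variance is recovered from the first two factorial moments, as in Proposition III.2.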


In the same vein, the number of parts equal to a fixed number r in compositions is
determined by

$$\mathcal{C} = \mathrm{SEQ}\bigl(\mu \mathcal{Z}^r + \mathrm{SEQ}_{\ne r}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z,u) = \left(1 - \left(\frac{z}{1-z} + (u-1)z^r\right)\right)^{-1}.$$

It is then easy to pull out the expected number of r-summands in a random composition of
size n. The differentiated form

$$\partial_u C(z,u)\big|_{u=1} = \frac{z^r (1-z)^2}{(1-2z)^2}$$

gives by partial fraction expansion

$$\partial_u C(z,u)\big|_{u=1} = \frac{2^{-r-2}}{(1-2z)^2} + \frac{2^{-r-1} - r 2^{-r-2}}{1-2z} + q(z),$$

for a polynomial q(z) that we do not need to make explicit. Extracting the nth coefficient of the
cumulative GF $C'_u(z,1)$ and dividing by $2^{n-1}$ yields the mean number of r-parts in a random
composition. Another differentiation gives access to the second moment. One finds:


Proposition III.4 (Summands in integer compositions). The total number of summands in a
random composition of size n has mean $\frac{1}{2}(n+1)$ and a distribution that is concentrated around
the mean. The number of r summands in a composition of size n has mean

$$\frac{n}{2^{r+1}} + O(1);$$

and a standard deviation of order $\sqrt{n}$, which also ensures concentration of distribution.
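The partial fraction expansion above can be made fully explicit and then tested by enumeration. Working it through (this step is not spelled out in the text) suggests that, for n > r, the mean number of r-summands is exactly $(n + 3 - r)/2^{r+1}$, the polynomial q(z) only affecting coefficients of index at most r. The sketch below, with illustrative function names, checks this by brute force over all compositions.

```python
from fractions import Fraction

def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def mean_r_summands(n, r):
    """Exact mean number of summands equal to r in a uniform random composition of n >= 1."""
    total = sum(c.count(r) for c in compositions(n))
    return Fraction(total, 2 ** (n - 1))

def predicted_mean(n, r):
    """Value read off from the partial fraction expansion, assumed valid for n > r only."""
    return Fraction(n + 3 - r, 2 ** (r + 1))
```

For r = 1 and r = 2 the leading terms are n/4 and n/8, which is the start of the profile discussed after the example.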
Clearly, suitable MGFs can keep track of any finite collection of summand types in
compositions, and the method is extremely general. Much use of this way of envisioning multivariate
enumeration will be made throughout this book. ............... END OF EXAMPLE III.5.


From the point of view of random structures, the example of summands shows 
that random compositions of large size tend to conform to a global “profile”. With 
high probability, a composition of size n should have about n/4 parts equal to 1, n/8 
parts equal to 2, and so on. Naturally, there are statistically unavoidable fluctuations, 
and for any finite n, the regularity of this law cannot be perfect: it tends to fade 
away especially as regards the largest summands that are $\log_2(n) + O(1)$ with high
probability. (In this region mean and standard deviation both become of the same order
and are O(1), so that concentration no longer holds.) However, such observations do
tell us a great deal about what a typical random composition must (probably) look
like: it should conform to a “logarithmic profile”,

$$1^{n/4}\, 2^{n/8}\, 3^{n/16}\, 4^{n/32} \cdots.$$

Here are for instance the profiles of two compositions of size n = 1024 drawn uniformly
at random:

$$1^{250}\, 2^{138}\, 3^{70}\, 4^{29}\, 5^{15}\, 6^{10}\, 7^{4}\, 8^{0}\, 9^{1}, \qquad 1^{253}\, 2^{136}\, 3^{68}\, 4^{31}\, 5^{13}\, 6^{8}\, 7^{3}\, 8^{1}\, 9^{1}\, 10^{2},$$

to be compared to the “ideal” profile

$$1^{256}\, 2^{128}\, 3^{64}\, 4^{32}\, 5^{16}\, 6^{8}\, 7^{4}\, 8^{2}\, 9^{1}.$$

It is a striking fact that samples of a very few elements or even just one element (this 
would be ridiculous by the usual standards of statistics) are often sufficient to illus- 
trate asymptotic properties of large random structures. The reason is once more to be 
attributed to concentration of distributions whose effect is manifest here. Profiles of a 
similar nature present themselves amongst objects defined by the sequence construc- 
tion, as we shall see throughout this book. (Establishing such general laws is often 
not difficult but it requires the full power of complex-analytic methods developed in 
Chapters IV-VIII.)

▷ III.6. Largest summands in compositions. For any $\epsilon > 0$, with probability tending to 1 as
$n \to \infty$, the largest summand in a random integer composition of size n is of size in the interval
$[(1-\epsilon)\log_2 n,\ (1+\epsilon)\log_2 n]$. (Hint: use the first and second moment methods. More precise
estimates are given in Chapter V.) ◁


In the sequel, it proves convenient to adopt a simplifying notation, much in the 
spirit of our basic convention, where the atom Z is systematically reflected by the 
name z of the variable in GFs. 


Simplified notation for marks. The same symbol (usually u,v, uy, U2...) 
is freely employed to designate a combinatorial mark (of size 0) and the 
corresponding marking variable in MGFs. 


For instance, we allow ourselves to write directly, for compositions, 


$$\mathcal{C} = \text{SEQ}(u\,\text{SEQ}_{\ge 1}(\mathcal{Z})), \qquad \mathcal{C} = \text{SEQ}(u u_1 \mathcal{Z} + u\,\text{SEQ}_{\ge 2}(\mathcal{Z})),$$
where u marks all summands and $u_1$ marks summands equal to 1, giving rise to (21)
and (23). Note that the symbolic scheme of Theorem III.1 invariably applies to enumeration
according to the number of zero-size marks inserted into specifications.


III. 3.3. Number of components in abstract unlabelled schemas. Consider a
construction $\mathcal{A} = \mathfrak{K}(\mathcal{B})$, where the metasymbol $\mathfrak{K}$ designates any standard unlabelled
constructor amongst SEQ, MSET, PSET, CYC. What is sought is the BGF A(z, u) of
class $\mathcal{A}$, with u marking each component. The specification is then of the form
$$\mathcal{A} = \mathfrak{K}(u\mathcal{B}), \qquad \mathfrak{K} = \text{SEQ}, \text{MSET}, \text{PSET}, \text{CYC}.$$
Theorem III.1 applies and immediately yields the BGF A(z, u). In addition, differentiating
with respect to u then setting u = 1 provides the GF of cumulated values
(hence, in a non-normalized form, the OGF of the sequence of mean values of the
number of components):
$$\Omega(z) = \frac{\partial}{\partial u} A(z,u)\Big|_{u=1}.$$

In summary: 


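Proposition III.5 (Components in unlabelled schemas). Given a construction $\mathcal{A} = \mathfrak{K}(\mathcal{B})$,
the BGF A(z, u) and the cumulated GF $\Omega(z)$ associated to the number of components
are given by the following table:


$\mathfrak{K}$       BGF $A(z,u)$                                                                                   Cumulative OGF $\Omega(z)$

SEQ:    $\dfrac{1}{1 - uB(z)}$                                                                                   $A(z)^2 \cdot B(z) = \dfrac{B(z)}{(1 - B(z))^2}$

PSET:   $\exp\Big(\sum_{k\ge1} \frac{(-1)^{k-1}}{k} u^k B(z^k)\Big) = \prod_{n\ge1} (1 + u z^n)^{B_n}$          $A(z) \cdot \sum_{k\ge1} (-1)^{k-1} B(z^k)$

MSET:   $\exp\Big(\sum_{k\ge1} \frac{u^k}{k} B(z^k)\Big) = \prod_{n\ge1} (1 - u z^n)^{-B_n}$                     $A(z) \cdot \sum_{k\ge1} B(z^k)$

CYC:    $\sum_{k\ge1} \dfrac{\varphi(k)}{k} \log \dfrac{1}{1 - u^k B(z^k)}$                                      $\sum_{k\ge1} \varphi(k) \dfrac{B(z^k)}{1 - B(z^k)}$

Mean values are then recovered with the usual formula,
$$\mathbb{E}_{\mathcal{A}_n}(\#\,\text{components}) = \frac{[z^n]\,\Omega(z)}{[z^n]\,A(z)}.$$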

A similar process applies to the number of components of a fixed size r in an A-object. 
▷ III.7. r-Components in abstract unlabelled schemas. Consider unlabelled structures. The
BGF of the number of r-components in $\mathcal{A} = \mathfrak{K}\{\mathcal{B}\}$ is given by
$$A(z,u) = \left(1 - B(z) - (u-1) B_r z^r\right)^{-1}, \qquad A(z,u) = A(z) \cdot \left(\frac{1 - z^r}{1 - u z^r}\right)^{B_r},$$
in the case of sequences ($\mathfrak{K}$ = SEQ) and multisets ($\mathfrak{K}$ = MSET), respectively. Similar formulae
hold for the other basic constructions and the cumulative GFs. ◁
FIGURE III.7. A random partition of size n = 100 has an aspect rather different from
the profile of a random composition of the same size (Figure 6).


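▷ III.8. Number of distinct components in a multiset. The specification and the BGF are
$$\prod_{\beta \in \mathcal{B}} \left(1 + u\,\text{SEQ}_{\ge 1}(\beta)\right) \quad\Longrightarrow\quad \prod_{n \ge 1} \left(1 + \frac{u z^n}{1 - z^n}\right)^{B_n},$$
as follows from first principles. ◁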
As an illustration, we discuss the profile of random partitions (Figure 7). 
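EXAMPLE III.6. The profile of partitions. Let $\mathcal{P} = \text{MSET}(\mathcal{I})$ be the class of all integer
partitions, where $\mathcal{I} = \text{SEQ}_{\ge 1}(\mathcal{Z})$ represents integers in unary notation. The BGF of $\mathcal{P}$ with u
marking the number χ of parts (or summands) is obtained from the specification
$$\mathcal{P} = \text{MSET}(u\,\mathcal{I}) \quad\Longrightarrow\quad P(z,u) = \exp\left(\sum_{k=1}^{\infty} \frac{u^k}{k}\,\frac{z^k}{1 - z^k}\right).$$
Equivalently, from first principles,
$$\mathcal{P} \cong \prod_{n=1}^{\infty} \text{SEQ}(u\,\mathcal{I}_n) \quad\Longrightarrow\quad P(z,u) = \prod_{n=1}^{\infty} \frac{1}{1 - u z^n}.$$
The OGF of cumulated values then results from the second form of the BGF by logarithmic
differentiation:
(24) $$\Omega(z) = P(z) \cdot \sum_{k=1}^{\infty} \frac{z^k}{1 - z^k}.$$
Now, the factor on the right in (24) can be expanded as
$$\sum_{k=1}^{\infty} \frac{z^k}{1 - z^k} = \sum_{n=1}^{\infty} d(n)\, z^n,$$
with d(n) the number of divisors of n. Thus, the mean value of χ is
(25) $$\mathbb{E}_n(\chi) = \frac{1}{P_n} \sum_{j=1}^{n} d(j)\, P_{n-j}.$$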
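
Formula (25) is easy to check numerically. The following is a minimal sketch, not part of the original text: the function names (`partition_counts`, `divisor_counts`, `mean_parts`, `partitions`) are illustrative choices, and the brute-force generator is only meant for small n. It evaluates (25) exactly with rational arithmetic and can be compared against direct enumeration.

```python
from fractions import Fraction

def partition_counts(N):
    """P[n] = number of integer partitions of n, for 0 <= n <= N."""
    P = [1] + [0] * N
    for part in range(1, N + 1):           # allow parts of size `part`
        for n in range(part, N + 1):
            P[n] += P[n - part]
    return P

def divisor_counts(N):
    """d[j] = number of divisors of j, for 1 <= j <= N (d[0] is unused)."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for multiple in range(k, N + 1, k):
            d[multiple] += 1
    return d

def mean_parts(n, P, d):
    """Mean number of parts over all partitions of n, following (25)."""
    return Fraction(sum(d[j] * P[n - j] for j in range(1, n + 1)), P[n])

def partitions(n, largest=None):
    """Brute-force generator of the partitions of n as weakly decreasing lists."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

P = partition_counts(500)
d = divisor_counts(500)
print(float(mean_parts(100, P, d)))   # exact mean for n = 100, shown as a float
```

The same tables `P` and `d` give the exact means for every n up to 500, which is the range plotted in Figure 8.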
FIGURE III.8. The number of parts in random partitions of size 1, ..., 500: exact values
of the mean and simulations (circles, one for each value of n).


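The same technique applies to the number of parts equal to r. The form of the BGF is
$$\mathcal{P} \cong \text{SEQ}(u\,\mathcal{I}_r) \times \prod_{n \ne r} \text{SEQ}(\mathcal{I}_n) \quad\Longrightarrow\quad P(z,u) = \frac{1 - z^r}{1 - u z^r} \cdot P(z),$$
which implies that the mean value of the number χ of r-parts satisfies
$$\mathbb{E}_n(\chi) = \frac{1}{P_n}\,[z^n] \left(P(z) \cdot \frac{z^r}{1 - z^r}\right) = \frac{1}{P_n}\left(P_{n-r} + P_{n-2r} + P_{n-3r} + \cdots\right).$$
From these formulae and a decent symbolic manipulation package, the means are easily
calculated for values of n well into the range of several thousand. . . . . . END OF EXAMPLE III.6.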


The comparison between Figures 6 and 7 together with the supporting analysis
shows that different combinatorial models may well lead to rather different types of
probabilistic behaviours. Figure 8 displays the exact value of the mean number of parts
in random partitions of size n = 1, ..., 500 (as calculated from (25)), accompanied
by the observed values of one random sample for each value of n in the range. The
mean number of parts is known to be asymptotic to
$$\frac{\sqrt{n}\,\log n}{\pi\sqrt{2/3}},$$
and the distribution, though it admits a comparatively large standard deviation ($O(\sqrt{n})$),
is still concentrated in the technical sense. We shall prove some of these assertions in
Chapter IX, p. 547 (see also [155]).

FIGURE III.9. Two partitions of $\mathcal{P}_{1000}$ drawn at random, compared to the limiting shape
$\Psi(x)$ defined by (26).

In recent years, Vershik and his collaborators [118, 484] have shown that most integer
partitions tend to conform to a definite profile given (after normalization by $\sqrt{n}$)
by the continuous plane curve $y = \Psi(x)$ defined implicitly by
(26) $$y = \Psi(x) \quad\text{iff}\quad e^{-\alpha x} + e^{-\alpha y} = 1, \qquad \alpha := \frac{\pi}{\sqrt{6}}.$$
This is illustrated in Figure 9 by two randomly drawn elements of $\mathcal{P}_{1000}$ represented
together with the “most likely” limit shape. The theoretical result explains the huge
differences that are manifest on simulations between integer compositions and integer
partitions.


The last example demonstrates the application of BGFs to estimates regarding
the root degree of a tree drawn uniformly at random amongst the class $\mathcal{G}_n$ of general
Catalan trees of size n. Tree parameters such as number of leaves and path length
that are more global in nature and need a recursive definition will be discussed in
Section III. 5 below.

EXAMPLE III.7. Root degree in general Catalan trees. Consider the parameter χ equal to
the degree of the root in a tree, and take the class $\mathcal{G}$ of all plane unlabelled trees, i.e., general
Catalan trees. The specification is obtained by first defining trees ($\mathcal{G}$), then defining trees with a
mark for subtrees ($\mathcal{G}^\circ$) dangling from the root:
$$\mathcal{G} = \mathcal{Z} \times \text{SEQ}(\mathcal{G}) \;\Longrightarrow\; G(z) = \frac{z}{1 - G(z)}, \qquad \mathcal{G}^\circ = \mathcal{Z} \times \text{SEQ}(u\mathcal{G}) \;\Longrightarrow\; G(z,u) = \frac{z}{1 - uG(z)}.$$
This set of equations reveals that the probability that the root degree equals r is
$$\mathbb{P}_{\mathcal{G}_n}\{\chi = r\} = \frac{1}{G_n}\,[z^{n-1}]\,G(z)^r = \frac{1}{G_n}\,\frac{r}{n-1}\binom{2n-3-r}{n-2} \sim \frac{r}{2^{r+1}},$$
this by Lagrange inversion and elementary asymptotics. Also, the cumulative GF is found to be
$$\Omega(z) = \frac{zG(z)}{(1 - G(z))^2}.$$
The relation satisfied by G entails a further simplification,
$$\Omega(z) = \frac{1}{z}\,G(z)^3 = \left(\frac{1}{z} - 1\right) G(z) - 1,$$
so that the mean root degree admits a closed form,
$$\mathbb{E}_{\mathcal{G}_n}(\chi) = \frac{1}{G_n}\left(G_{n+1} - G_n\right) = 3\,\frac{n-1}{n+1},$$
a quantity clearly asymptotic to 3.
A random plane tree is thus usually composed of a small number of root subtrees, at least
one of which should accordingly be fairly large. . . . . . END OF EXAMPLE III.7.
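
The closed forms of Example III.7 can be cross-checked with exact arithmetic. The sketch below is an illustration only; the helper names (`catalan_trees`, `root_degree_prob`, `root_degree_law`) are hypothetical and not taken from the text. It builds the exact law of the root degree from the Lagrange-inversion formula and lets one confirm that it sums to 1 and has mean 3(n − 1)/(n + 1).

```python
from fractions import Fraction
from math import comb

def catalan_trees(n):
    """G_n = (1/n) * binomial(2n-2, n-1): number of plane trees with n nodes."""
    return comb(2 * n - 2, n - 1) // n

def root_degree_prob(n, r):
    """Exact probability that the root has degree r in a random tree of size n >= 2."""
    return Fraction(r * comb(2 * n - 3 - r, n - 2), (n - 1) * catalan_trees(n))

def root_degree_law(n):
    """Exact law {r: probability} of the root degree, for r = 1, ..., n-1."""
    return {r: root_degree_prob(n, r) for r in range(1, n)}

law = root_degree_law(20)
mean = sum(r * p for r, p in law.items())
print(mean, float(mean))                        # exact mean for n = 20
print([float(law[r]) for r in (1, 2, 3)])       # compare with r / 2**(r+1)
```

For moderate n the first few probabilities are already close to the limit values r/2^{r+1}, i.e. 1/4, 1/4, 3/16.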


III. 4. Inherited parameters and exponential MGFs


The theory of inheritance developed in the last section applies almost verbatim to 
labelled objects. The only difference is that the variable marking size must carry a fac- 
torial coefficient dictated by the needs of relabellings. Once more, with a suitable use 
of multi-index conventions, the translation mechanisms developed in the univariate 
case (Chapter II) remain in force, this in a way that parallels the unlabelled case.


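Let us consider a pair $(\mathcal{A}, \chi)$, where $\mathcal{A}$ is a labelled combinatorial class endowed
with its size function | · | and $\chi = (\chi_1, \ldots, \chi_d)$ is a d-dimensional parameter. As
before, the parameter χ is extended into $\boldsymbol{\chi}$ by inserting size as zeroth coordinate and
a vector $\mathbf{z} = (z_0, \ldots, z_d)$ of d + 1 indeterminates is introduced, with $z_0$ marking size
and $z_j$ marking $\chi_j$. Once the multi-index convention of (19) defining $\mathbf{z}^{\mathbf{k}}$ has been
brought into the game, the exponential MGF of $(\mathcal{A}, \chi)$ (see Definition III.4) can be
rephrased as
(27) $$A(\mathbf{z}) = \sum_{\mathbf{k}} A_{\mathbf{k}}\, \frac{\mathbf{z}^{\mathbf{k}}}{k_0!} = \sum_{\alpha \in \mathcal{A}} \frac{\mathbf{z}^{\boldsymbol{\chi}(\alpha)}}{|\alpha|!}.$$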

In a sense, this MGF is exponential in z (alias $z_0$) but ordinary in the other variables;
only the factorial $k_0!$ is needed to take into account relabelling induced by labelled
products.

We a priori restrict attention to parameters that do not depend on the absolute 
values of labels (but may well depend on the relative order of labels): a parameter is 
said to be compatible if, for any α, it assumes the same value on any labelled object α
and all the order-consistent relabellings of α. A parameter is said to be inherited if it is
compatible and it is defined by cases on disjoint unions and determined additively on 
labelled products—this is Definition III.5 with labelled products replacing cartesian 
products. In particular, for a compatible parameter, inheritance signifies additivity on 
components of labelled sequences, sets, and cycles. We can then cut-and-paste (with 
minor adjustments) the statement of Theorem III.1: 


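Theorem III.2 (Inherited parameters and exponential MGFs). Let $\mathcal{A}$ be a labelled
combinatorial class constructed from $\mathcal{B}, \mathcal{C}$, and let χ be a parameter inherited from
ξ defined on $\mathcal{B}$ and (as the case may be) from ζ on $\mathcal{C}$. Then the translation rules
of admissible constructions stated in Theorem II.1 apply, provided the multi-index
convention (27) is used. The associated operators on exponential MGFs are:

Union:      $\mathcal{A} = \mathcal{B} + \mathcal{C}$        $\Longrightarrow$   $A(\mathbf{z}) = B(\mathbf{z}) + C(\mathbf{z})$
Product:    $\mathcal{A} = \mathcal{B} \star \mathcal{C}$    $\Longrightarrow$   $A(\mathbf{z}) = B(\mathbf{z}) \cdot C(\mathbf{z})$
Sequence:   $\mathcal{A} = \text{SEQ}(\mathcal{B})$          $\Longrightarrow$   $A(\mathbf{z}) = \dfrac{1}{1 - B(\mathbf{z})}$
Cycle:      $\mathcal{A} = \text{CYC}(\mathcal{B})$          $\Longrightarrow$   $A(\mathbf{z}) = \log \dfrac{1}{1 - B(\mathbf{z})}$
Set:        $\mathcal{A} = \text{SET}(\mathcal{B})$          $\Longrightarrow$   $A(\mathbf{z}) = \exp(B(\mathbf{z}))$.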


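PROOF. Disjoint unions are treated as in the unlabelled multivariate case. Labelled
products result from
$$\sum_{\alpha \in \mathcal{A}} \frac{\mathbf{z}^{\boldsymbol{\chi}(\alpha)}}{|\alpha|!} = \sum_{\beta \in \mathcal{B},\, \gamma \in \mathcal{C}} \binom{|\beta| + |\gamma|}{|\beta|,\, |\gamma|} \frac{\mathbf{z}^{\boldsymbol{\xi}(\beta)}\, \mathbf{z}^{\boldsymbol{\zeta}(\gamma)}}{(|\beta| + |\gamma|)!},$$
and the usual translation of binomial convolutions that reflect labellings by means of
products of exponential generating functions (as in the univariate case detailed in
Chapter II). The translation for composite constructions is then immediate.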

This theorem can be exploited to determine moments, in a way that entirely par- 
allels its unlabelled counterpart. 


EXAMPLE III.8. The profile of permutations. Let $\mathcal{P}$ be the class of all permutations and χ
the number of components. Using the concept of marking, the specification and the exponential
BGF are
$$\mathcal{P} = \text{SET}(u\,\text{CYC}(\mathcal{Z})) \quad\Longrightarrow\quad P(z,u) = \exp\left(u \log \frac{1}{1-z}\right) = (1-z)^{-u},$$
as was already obtained by an ad hoc calculation in (5). We also know (page 149) that the mean
number of cycles is the harmonic number $H_n$ and that the distribution is concentrated since the
standard deviation is much smaller than the mean.

Regarding the number χ of cycles of length r, the specification and the exponential BGF
are now
(28) $$\mathcal{P} = \text{SET}\left(\text{CYC}_{\ne r}(\mathcal{Z}) + u\,\text{CYC}_{=r}(\mathcal{Z})\right) \;\Longrightarrow\; P(z,u) = \exp\left(\log \frac{1}{1-z} + (u-1)\frac{z^r}{r}\right) = \frac{e^{(u-1)z^r/r}}{1-z}.$$
The EGF of cumulated values is then
(29) $$\Omega(z) = \frac{z^r}{r}\,\frac{1}{1-z}.$$
The result is a remarkably simple one: In a random permutation of size n, the mean number
of r-cycles is equal to $\frac{1}{r}$ for any $r \le n$.

Thus, the profile of a random permutation, where profile is defined as the ordered sequence
of cycle lengths, departs significantly from what has been encountered for integer compositions
and partitions. Formula (29) also sheds a new light on the harmonic number formula for the
mean number of cycles: each term $\frac{1}{r}$ in the harmonic number expresses the mean number of
r-cycles.

FIGURE III.10. The profile of permutations: a rendering of the cycle structure of six
random permutations of size 500, where circle areas are drawn in proportion to cycle
lengths. Permutations tend to have a few small cycles (of size O(1)), a few large ones (of
size O(n)), and altogether have $H_n \sim \log n$ cycles on average.

Since formulae are so simple, one can get more information. By (28) one has, as seen
above,
$$\mathbb{P}\{\chi = k\} = \frac{1}{k!\,r^k}\,[z^{n-kr}]\,\frac{e^{-z^r/r}}{1-z},$$
where the last factor counts permutations without cycles of length r. From this (and the asymptotics
of generalized derangement numbers in Chapter IV), one proves easily that the asymptotic
law of the number of r-cycles is Poisson² of rate $\frac{1}{r}$; in particular it is not concentrated. (This
interesting property, to be established in later chapters, constitutes the starting point of an important
study by Shepp and Lloyd [436].)

Also, the mean number of cycles whose size is between n/2 and n is $H_n - H_{\lfloor n/2 \rfloor}$, a
quantity that equals the probability of existence of such a long cycle and is approximately
log 2 ≐ 0.69314. In other words, we expect a random permutation of size n to have one
or a few large cycles. (See the paper [436] for the original discussion of largest and smallest
cycles.) . . . . . . END OF EXAMPLE III.8.
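
The statement that the mean number of r-cycles equals 1/r can be confirmed by exhaustive enumeration for small sizes. The following sketch is illustrative only (the function names `cycle_lengths` and `mean_r_cycles` are hypothetical); it enumerates all n! permutations, so it is only practical for n up to about 8.

```python
from fractions import Fraction
from itertools import permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple, perm[i] being the image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:          # follow the cycle through `start`
            seen.add(i)
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths

def mean_r_cycles(n, r):
    """Exact mean number of cycles of length r over all n! permutations of size n."""
    total = count = 0
    for perm in permutations(range(n)):
        total += cycle_lengths(perm).count(r)
        count += 1
    return Fraction(total, count)

print([mean_r_cycles(5, r) for r in range(1, 6)])
```

Summing the means over r = 1, ..., n recovers the harmonic number H_n, in agreement with the remark following (29).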


▷ III.9. A hundred prisoners II. This is the solution to the prisoners' problem of Note II.14,
p. 114. The better strategy goes as follows. Each prisoner will first open the drawer which
corresponds to his number. If his number is not there, he'll use the number he just found to
access another drawer, then find a number there that points him to a third drawer, and so on,
hoping to return to his original drawer in at most 50 trials. (The last opened drawer will then
contain his number.) This strategy succeeds provided the initial permutation σ defined by $\sigma_i$
being the number contained in drawer i has all its cycles of length at most 50. The probability
of the event is
$$p = [z^{100}]\,\exp\left(\frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^{50}}{50}\right) = 1 - \sum_{j=51}^{100} \frac{1}{j} \doteq 0.31182\,78206.$$
Do the prisoners stand a chance against a malicious director who would not place the numbers
in drawers at random? For instance, the director might organize the numbers in a cyclic permutation.
[Hint: randomize the problem by renumbering the drawers according to a randomly
chosen permutation.] ◁
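
The numerical value in Note III.9 can be reproduced in two independent ways. The sketch below is a hedged illustration (the function name `prob_all_cycles_at_most` is hypothetical): it extracts the coefficient of the truncated exponential with exact rationals, using the standard recurrence for the exponential of a power series, and compares it with the closed form obtained by subtracting the probability of a long cycle.

```python
from fractions import Fraction

def prob_all_cycles_at_most(n, max_len):
    """[z^n] exp(z/1 + z^2/2 + ... + z^max_len/max_len), computed exactly.

    For E = exp(F) one has m*e_m = sum_k k*f_k*e_{m-k}; here k*f_k = 1
    for 1 <= k <= max_len and 0 otherwise.
    """
    e = [Fraction(1)]
    for m in range(1, n + 1):
        e.append(sum(e[m - k] for k in range(1, min(m, max_len) + 1)) / m)
    return e[n]

p_series = prob_all_cycles_at_most(100, 50)
p_closed = 1 - sum(Fraction(1, j) for j in range(51, 101))
print(float(p_series), float(p_closed))
```

The two values agree exactly because a permutation of size 100 contains at most one cycle longer than 50, so the events "there is a cycle of length j" for j > 50 are disjoint.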


EXAMPLE III.9. Allocations, balls-in-bins models, and the Poisson law. Random allocations 
and the balls-in-bins model have been introduced in Chapter II in connection with the birthday 
paradox and the coupon collector problem. Under this model, there are n balls thrown into 


² The Poisson distribution of rate λ > 0 is supported by the nonnegative integers and determined by
$$\mathbb{P}\{k\} = e^{-\lambda}\,\frac{\lambda^k}{k!}.$$
FIGURE III.11. Two random allocations with m = 12, n = 48. The rightmost diagrams
display the bins sorted by decreasing order of occupancy.


m bins in all possible ways, the total number of allocations being thus $m^n$. By the labelled
construction of words, the bivariate EGF with z marking the number of balls and u marking the
number $\chi^{(s)}$ of bins that contain s balls (s a fixed parameter) is given by
$$\mathcal{A} = \text{SEQ}_m\left(\text{SET}_{\ne s}(\mathcal{Z}) + u\,\text{SET}_{=s}(\mathcal{Z})\right) \quad\Longrightarrow\quad A^{(s)}(z,u) = \left(e^z + (u-1)\frac{z^s}{s!}\right)^m.$$
In particular, the distribution of the number of empty bins ($\chi^{(0)}$) is expressible in terms of
Stirling partition numbers:
$$\mathbb{P}_{m,n}(\chi^{(0)} = k) = \frac{n!}{m^n}\,[u^k z^n]\,A^{(0)}(z,u) = \frac{(m-k)!}{m^n} \binom{m}{k} \left\{ {n \atop m-k} \right\}.$$
By differentiation of the BGF, there results an exact expression for the mean (any s ≥ 0):
(30) $$\mathbb{E}_{m,n}(\chi^{(s)}) = \frac{n!}{m^n}\,[z^n]\,\frac{z^s}{s!}\,m\,e^{z(m-1)} = \frac{m}{s!}\left(1 - \frac{1}{m}\right)^{n-s} \frac{n(n-1)\cdots(n-s+1)}{m^s}.$$
Let m and n tend to infinity in such a way that $\frac{n}{m} = \lambda$ is a fixed constant. This regime is
extremely important in many applications, some of which are listed below. The average proportion
of bins containing s elements is $\frac{1}{m}\mathbb{E}_{m,n}(\chi^{(s)})$, and from (30), one obtains by straightforward
calculations the asymptotic limit estimate,
(31) $$\lim_{n/m = \lambda,\ n \to \infty} \frac{1}{m}\,\mathbb{E}_{m,n}(\chi^{(s)}) = e^{-\lambda}\,\frac{\lambda^s}{s!}.$$
In other words, a Poisson formula describes the average proportion of bins of a given size in a
large random allocation. (Equivalently, the occupancy of a random bin in a random allocation
satisfies a Poisson law in the limit.)

The variance of each $\chi^{(s)}$ (with fixed s) is estimated similarly via a second derivative and
one finds:
$$\mathbb{V}_{m,n}(\chi^{(s)}) \sim m\,e^{-2\lambda}\,\frac{\lambda^s}{s!}\,E(\lambda), \qquad E(\lambda) := e^{\lambda} - \frac{s\lambda^{s-1}}{(s-1)!} - (1-2s)\frac{\lambda^s}{s!} - \frac{\lambda^{s+1}}{s!}.$$
As a consequence, one has the convergence in probability,
$$\frac{1}{m}\,\chi^{(s)} \;\xrightarrow{\;P\;}\; e^{-\lambda}\,\frac{\lambda^s}{s!},$$
valid for any fixed s ≥ 0. . . . . . END OF EXAMPLE III.9.
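
The exact mean (30) and its Poisson limit (31) can be compared numerically. This is a minimal sketch under stated assumptions: the function names (`exact_mean`, `brute_force_mean`, `poisson`) are illustrative, and the brute-force routine enumerates all m^n allocations, so it is restricted to very small m and n.

```python
from fractions import Fraction
from itertools import product
from math import comb, exp, factorial

def exact_mean(m, n, s):
    """Mean number of bins holding exactly s balls (n balls, m bins), as in (30)."""
    if s > n:
        return Fraction(0)
    return m * Fraction(comb(n, s) * (m - 1) ** (n - s), m ** n)

def brute_force_mean(m, n, s):
    """Same mean, obtained by enumerating all m**n allocations."""
    total = 0
    for alloc in product(range(m), repeat=n):       # alloc[i] = bin of ball i
        total += sum(1 for b in range(m) if alloc.count(b) == s)
    return Fraction(total, m ** n)

def poisson(lam, s):
    """Poisson probability exp(-lam) * lam**s / s!, the limit appearing in (31)."""
    return exp(-lam) * lam ** s / factorial(s)

m, lam, s = 1000, 4, 3
print(float(exact_mean(m, lam * m, s)) / m, poisson(lam, s))
```

With m = 1000 bins and load λ = 4, the exact proportion of bins holding three balls already agrees with the Poisson value to about four decimal places.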




▷ III.10. Hashing and random allocations. Random allocations of balls into bins are central
in the understanding of a class of important algorithms of computer science known as
hashing [208, 307, 433, 434, 486]: given a universe $\mathcal{U}$ of data, set up a function (called a hashing
function) $h : \mathcal{U} \to [1\,..\,m]$ and arrange for an array of m bins; an element $x \in \mathcal{U}$ is
placed in bin number h(x). If the hash function scrambles the data in a way that is suitably
(pseudo)uniform, then the process of hashing a file of n records (keys, data items) into m bins
is adequately modelled by a random allocation scheme. If $\lambda = \frac{n}{m}$, representing the “load”, is
kept reasonably bounded (say, λ ≤ 10), the previous analysis implies that hashing allows for
an almost direct access to data. ◁


Number of components in abstract labelled schemas. As in the unlabelled universe,
a general formula gives the distribution of the number of components for the
basic constructions.
Proposition III.6. Consider labelled structures and the parameter χ equal to the
number of components in a construction $\mathcal{A} = \mathfrak{K}\{\mathcal{B}\}$, where $\mathfrak{K}$ is one of SEQ, SET, CYC.
The exponential BGF A(z, u) and the exponential GF Ω(z) of cumulated values are
given by the following table:

(32)
$\mathfrak{K}$       exp. MGF $A(z,u)$                    Cumul. EGF $\Omega(z)$

SEQ:    $\dfrac{1}{1 - uB(z)}$                $A(z)^2 \cdot B(z) = \dfrac{B(z)}{(1 - B(z))^2}$

SET:    $\exp(uB(z))$                         $A(z) \cdot B(z) = B(z)\,e^{B(z)}$

CYC:    $\log \dfrac{1}{1 - uB(z)}$           $\dfrac{B(z)}{1 - B(z)}$

Mean values are then easily recovered, and one finds
$$\mathbb{E}_{\mathcal{A}_n}(\chi) = \frac{\Omega_n}{A_n} = \frac{[z^n]\,\Omega(z)}{[z^n]\,A(z)},$$
by the same formula as in the unlabelled case.


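▷ III.11. r-Components in abstract labelled schemas. The BGF A(z, u) and the cumulative
EGF Ω(z) are given by the following table,

SEQ:    $\dfrac{1}{1 - \left(B(z) + (u-1)\frac{B_r z^r}{r!}\right)}$           $\dfrac{1}{(1 - B(z))^2} \cdot \dfrac{B_r z^r}{r!}$

SET:    $\exp\left(B(z) + (u-1)\dfrac{B_r z^r}{r!}\right)$                     $e^{B(z)} \cdot \dfrac{B_r z^r}{r!}$

CYC:    $\log \dfrac{1}{1 - \left(B(z) + (u-1)\frac{B_r z^r}{r!}\right)}$      $\dfrac{1}{1 - B(z)} \cdot \dfrac{B_r z^r}{r!}$

in the labelled case. ◁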


EXAMPLE III.10. Set partitions. Set partitions $\mathcal{S}$ are sets of blocks, themselves nonempty sets
of elements. The enumeration of set partitions according to the number of blocks is then given
by
$$\mathcal{S} = \text{SET}(u\,\text{SET}_{\ge 1}(\mathcal{Z})) \quad\Longrightarrow\quad S(z,u) = e^{u(e^z - 1)}.$$
Since set partitions are otherwise known to be enumerated by the Stirling partition numbers,
one has the BGF and the vertical EGFs as a corollary,
$$\sum_{n,k} \left\{ {n \atop k} \right\} u^k \frac{z^n}{n!} = e^{u(e^z - 1)}, \qquad \sum_{n} \left\{ {n \atop k} \right\} \frac{z^n}{n!} = \frac{1}{k!}\,(e^z - 1)^k,$$
which is consistent with earlier calculations of Chapter II.
The EGF of cumulated values, Ω(z), is then
$$\Omega(z) = (e^z - 1)\,e^{e^z - 1},$$
which is almost a derivative of S(z):
$$\Omega(z) = \frac{d}{dz} S(z) - S(z).$$
Thus, the mean number of blocks in a random partition of size n is
$$\frac{\Omega_n}{S_n} = \frac{S_{n+1}}{S_n} - 1,$$
a quantity directly expressible in terms of Bell numbers. A delicate computation based on
the asymptotic expansion of the Bell numbers reveals that the expected value and the standard
deviation are asymptotic to (Chapter VIII)
$$\frac{n}{\log n}, \qquad \frac{\sqrt{n}}{\log n},$$
respectively. Similarly the exponential BGF of the number of blocks of size k is
$$\mathcal{S} = \text{SET}\left(u\,\text{SET}_{=k}(\mathcal{Z}) + \text{SET}_{\ne 0,k}(\mathcal{Z})\right) \quad\Longrightarrow\quad S(z,u) = e^{e^z - 1 + (u-1) z^k / k!},$$
out of which mean and variance can be derived. . . . . . END OF EXAMPLE III.10.
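
The identity for the mean number of blocks, namely the ratio of consecutive Bell numbers minus 1, can be checked against a direct computation from Stirling partition numbers. The sketch below is illustrative; the names `stirling2_row`, `bell` and `mean_blocks` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def stirling2_row(n):
    """Row n of the Stirling partition numbers: row[k] counts partitions of an
    n-set into k blocks, via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    row = [1]                                    # row for n = 0
    for _ in range(n):
        row = [0] + [(k * row[k] if k < len(row) else 0) + row[k - 1]
                     for k in range(1, len(row) + 1)]
    return row

def bell(n):
    """Bell number S_n = total number of partitions of an n-set."""
    return sum(stirling2_row(n))

def mean_blocks(n):
    """Mean number of blocks, computed directly from the distribution."""
    row = stirling2_row(n)
    return Fraction(sum(k * c for k, c in enumerate(row)), sum(row))

n = 20
print(mean_blocks(n), Fraction(bell(n + 1), bell(n)) - 1)
```

Both printed fractions coincide, as predicted by the relation between Ω(z) and the derivative of S(z).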


EXAMPLE III.11. Root degree in Cayley trees. Consider the class $\mathcal{T}$ of Cayley trees (nonplane
labelled trees) and the parameter “root-degree”. The basic specifications are
$$\mathcal{T} = \mathcal{Z} \star \text{SET}(\mathcal{T}) \;\Longrightarrow\; T(z) = z\,e^{T(z)}, \qquad \mathcal{T}^\circ = \mathcal{Z} \star \text{SET}(u\mathcal{T}) \;\Longrightarrow\; T(z,u) = z\,e^{uT(z)}.$$
The set construction reflects the non-planar character of Cayley trees and the specification $\mathcal{T}^\circ$ is
enriched by a mark associated to subtrees dangling from the root. Lagrange inversion provides
the fraction of trees with root degree k,
$$\frac{1}{(k-1)!}\,\frac{n!}{(n-1-k)!}\,\frac{(n-1)^{n-2-k}}{n^{n-1}} \sim \frac{e^{-1}}{(k-1)!}, \qquad k \ge 1.$$
Similarly, the cumulative GF is found to be $\Omega(z) = T(z)^2$, so that the mean root degree satisfies
$$\mathbb{E}_{\mathcal{T}_n}(\text{root degree}) = 2\left(1 - \frac{1}{n}\right) \sim 2.$$
Thus the law of root degree is asymptotically a Poisson law of rate 1 (shifted by 1). Probabilistic
phenomena qualitatively similar to those encountered in plane trees are observed here as the
mean root degree is asymptotic to a constant. However a Poisson law eventually reflecting the
nonplanarity condition replaces the modified geometric law (known as a negative binomial law)
present in plane trees. . . . . . END OF EXAMPLE III.11.
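
As with plane trees, the root-degree formula for Cayley trees lends itself to an exact sanity check. The sketch below is an illustration with hypothetical helper names (`cayley_root_degree_prob`, `cayley_root_degree_law`); rational arithmetic is used so that the negative exponent arising when k = n − 1 is handled exactly.

```python
from fractions import Fraction
from math import exp, factorial

def cayley_root_degree_prob(n, k):
    """Exact fraction of the n**(n-1) rooted Cayley trees of size n >= 2
    whose root has degree k, for 1 <= k <= n-1."""
    prefactor = Fraction(factorial(n),
                         factorial(k - 1) * factorial(n - 1 - k) * n ** (n - 1))
    # Fraction ** int stays exact, including the exponent -1 reached at k = n-1.
    return prefactor * Fraction(n - 1) ** (n - 2 - k)

def cayley_root_degree_law(n):
    """Exact law {k: probability} of the root degree."""
    return {k: cayley_root_degree_prob(n, k) for k in range(1, n)}

law = cayley_root_degree_law(30)
print(float(sum(k * p for k, p in law.items())))        # exact mean, 2*(1 - 1/30)
print([float(law[k]) for k in (1, 2, 3)],
      [exp(-1) / factorial(k - 1) for k in (1, 2, 3)])  # shifted Poisson limit
```

The second line of output compares the first few exact probabilities with the limiting values e^{-1}/(k − 1)!.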


▷ III.12. Numbers of components in alignments. Alignments ($\mathcal{O}$) are sequences of cycles
(Chapter II). The expected number of components in a random alignment of $\mathcal{O}_n$ is
$$\frac{[z^n]\,\log(1-z)^{-1}\,\left(1 - \log(1-z)^{-1}\right)^{-2}}{[z^n]\,\left(1 - \log(1-z)^{-1}\right)^{-1}}.$$
Methods of Chapter V imply that the number of components in a random alignment has
expectation ∼ n/(e − 1) and standard deviation $O(\sqrt{n})$. ◁
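Unlabelled structures

Integer partitions, MSET ∘ SEQ                                                     Integer compositions, SEQ ∘ SEQ
$\exp\left(u\frac{z}{1-z} + \frac{u^2}{2}\frac{z^2}{1-z^2} + \cdots\right)$         $\left(1 - u\frac{z}{1-z}\right)^{-1}$
$\sim \frac{\sqrt{n}\,\log n}{\pi\sqrt{2/3}}$,  $O(\sqrt{n})$                       $\sim \frac{n}{2}$,  $\sim \frac{\sqrt{n}}{2}$

Labelled structures

Set partitions, SET ∘ SET                                                          Surjections, SEQ ∘ SET
$\exp\left(u(e^z - 1)\right)$                                                       $\left(1 - u(e^z - 1)\right)^{-1}$
$\sim \frac{n}{\log n}$,  $\sim \frac{\sqrt{n}}{\log n}$                            $\sim \frac{n}{2\log 2}$,  $O(\sqrt{n})$

Permutations, SET ∘ CYC                                                            Alignments, SEQ ∘ CYC
$\exp\left(u\log(1-z)^{-1}\right)$                                                  $\left(1 - u\log(1-z)^{-1}\right)^{-1}$
$\sim \log n$,  $\sim \sqrt{\log n}$                                                $\sim \frac{n}{e-1}$,  $O(\sqrt{n})$


FIGURE III.12. Major properties of the number of components in six level-two structures.
For each class, from top to bottom: (i) specification type; (ii) BGF; (iii) mean and
standard deviation of the number of components.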


▷ III.13. Image cardinality of a random surjection. The expected cardinality of the image of a
random surjection in $\mathcal{R}_n$ (see Chapter II) is
$$\frac{[z^n]\,(e^z - 1)(2 - e^z)^{-2}}{[z^n]\,(2 - e^z)^{-1}},$$
in accordance with the SEQ entry of Proposition III.6 taken with $B(z) = e^z - 1$.
The number of values whose preimages have cardinality k is obtained by replacing the
factor $e^z - 1$ by $z^k/k!$. Methods of Chapters IV and V imply that the image cardinality
of a random surjection has expectation ∼ n/(2 log 2) and standard deviation $O(\sqrt{n})$. ◁


▷ III.14. Distinct component sizes in set partitions. Take the number of distinct block sizes
and cycle sizes in set partitions and permutations. The bivariate EGFs are
$$\prod_{n=1}^{\infty} \left(1 - u + u\,e^{z^n/n!}\right), \qquad \prod_{n=1}^{\infty} \left(1 - u + u\,e^{z^n/n}\right),$$
as follows from first principles. ◁


Postscript: Towards a theory of schemas. Let us look back and recapitulate
some of the information gathered in pages 156–169 regarding the number of components
in composite structures. The classes considered in the table of Figure 12 are compositions
of two constructions, either in the unlabelled or the labelled universe. Each entry
contains the BGF for the number of components (e.g., cycles in permutations, parts
in integer partitions, and so on), and the asymptotic orders of the mean and standard
deviation of the number of components for objects of size n.

Some obvious facts stand out from the data and call for explanation. First the
outer construction appears to play the essential rôle: outer sequence constructs (cf.
integer compositions, surjections and alignments) tend to dictate a number of components
that is O(n) on average, while outer set constructs (cf. integer partitions, set partitions,
and permutations) are associated with a greater variety of asymptotic regimes.
Eventually, such facts can be organized into broad analytic schemas, as will be seen
in Chapters IV–IX.


▷ III.15. Singularity and probability. The differences in behaviour are to be assigned to the
rather different types of singularity involved: on the one hand sets corresponding algebraically
to an exp(·) operator induce an exponential blow up of singularities; on the other hand sequences
expressed algebraically by quasi-inverses $(1 - \cdot)^{-1}$ are likely to induce polar singularities.
Recursive structures like trees lead to yet other types of phenomena with a number of
components, i.e., the root degree, that is bounded in probability. ◁


III. 5. Recursive parameters


In this section, we adapt the general methodology of previous sections in order to 
treat parameters that are defined by recursive rules over structures that are themselves 
recursively specified. Typical applications concern trees and tree-like structures. 


Regarding the number of leaves, or more generally, the number of nodes of some 
fixed degree, in a tree, the method of placing marks applies like in the non-recursive 
case. It suffices to distinguish elements of interest and mark them by an auxiliary 
variable. For instance, in order to mark composite objects made of r components,
where r is an integer and $\mathfrak{K}$ designates any of SEQ, SET (or MSET, PSET), CYC, one
should split a construction $\mathfrak{K}(\mathcal{C})$ according to the identity
$$\mathfrak{K}(\mathcal{C}) = \mathfrak{K}_{=r}(\mathcal{C}) + \mathfrak{K}_{\ne r}(\mathcal{C}),$$
then introduce a mark (u) in front of the first term of the sum. This technique gives
rise to specifications decorated by marks to which Theorems III.1 and III.2 apply. For 
a recursively defined structure, the outcome is a functional equation defining the BGF 
recursively. This technique is illustrated by Examples 12 and 13 below in the case of 
Catalan trees and the parameter number of leaves. 


EXAMPLE III.12. Leaves in general Catalan trees. How many leaves does a random
tree of some variety have? Can different varieties of trees be somehow distinguished by the
proportion of their leaves? Beyond the botany of combinatorics, such considerations are for
instance relevant to the analysis of algorithms since tree leaves, having no descendants, can be
stored more economically; see [306, Sec. 2.3] for an algorithmic motivation for such questions.

Consider once more the class $\mathcal{G}$ of plane unlabelled trees, $\mathcal{G} = \mathcal{Z} \times \text{SEQ}(\mathcal{G})$, enumerated
by the Catalan numbers: $G_n = \frac{1}{n}\binom{2n-2}{n-1}$. The class $\mathcal{G}^\circ$ where each leaf is marked is
$$\mathcal{G}^\circ = \mathcal{Z}u + \mathcal{Z} \times \text{SEQ}_{\ge 1}(\mathcal{G}^\circ) \quad\Longrightarrow\quad G(z,u) = zu + \frac{z\,G(z,u)}{1 - G(z,u)}.$$
The induced quadratic equation can be solved explicitly:
$$G(z,u) = \frac{1}{2}\left(1 + (u-1)z - \sqrt{1 - 2(u+1)z + (u-1)^2 z^2}\right).$$
It is however simpler to expand using the Lagrange inversion theorem, which provides
$$G_{n,k} = [u^k]\left([z^n]\,G(z,u)\right) = [u^k]\left(\frac{1}{n}\,[y^{n-1}]\left(u + \frac{y}{1-y}\right)^n\right) = \frac{1}{n}\binom{n}{k}\,[y^{n-1}]\,\frac{y^{n-k}}{(1-y)^{n-k}} = \frac{1}{n}\binom{n}{k}\binom{n-2}{k-1}.$$
These numbers are known as Narayana numbers, see EIS A001263, and they surface repeatedly
in connection with ballot problems. The mean number of leaves derives from the cumulative
GF, which is
$$\Omega(z) = \partial_u G(z,u)\big|_{u=1} = \frac{1}{2}\,z + \frac{1}{2}\,\frac{z}{\sqrt{1-4z}},$$
so that the mean is n/2 exactly for n ≥ 2. The distribution is concentrated since the standard
deviation is easily calculated to be $O(\sqrt{n})$. . . . . . END OF EXAMPLE III.12.
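
The Narayana formula and the claim that the mean number of leaves is exactly n/2 can be verified directly. The sketch below is illustrative; the function names `narayana` and `mean_leaves` are assumptions made for the example rather than notation from the text.

```python
from fractions import Fraction
from math import comb

def narayana(n, k):
    """Number of general Catalan trees with n nodes and k leaves
    (n >= 2, 1 <= k <= n-1): (1/n) * C(n, k) * C(n-2, k-1)."""
    return comb(n, k) * comb(n - 2, k - 1) // n

def mean_leaves(n):
    """Exact mean number of leaves in a uniform random plane tree of size n >= 2."""
    counts = [narayana(n, k) for k in range(1, n)]
    total = sum(k * c for k, c in zip(range(1, n), counts))
    return Fraction(total, sum(counts))

print([narayana(6, k) for k in range(1, 6)])    # one row of the Narayana triangle
print(mean_leaves(6))
```

Summing a row gives back the Catalan number G_n, and the mean comes out as n/2, reflecting the symmetry of each row.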


EXAMPLE III.13. Leaves and node types in binary trees. The class $\mathcal{B}$ of binary plane trees,
also enumerated by Catalan numbers ($B_n = \frac{1}{n+1}\binom{2n}{n}$), can be specified as
(33) $$\mathcal{B} = \mathcal{Z} + (\mathcal{B} \times \mathcal{Z}) + (\mathcal{Z} \times \mathcal{B}) + (\mathcal{B} \times \mathcal{Z} \times \mathcal{B}),$$
which stresses the distinction between four types of nodes: leaves, left branching, right branching,
and binary. Let $u_0, u_1, u_2$ be variables that mark nodes of degree 0, 1, 2, respectively. Then
the root decomposition (33) provides for the MGF $B = B(z, u_0, u_1, u_2)$ the functional equation
$$B = z u_0 + 2 z u_1 B + z u_2 B^2,$$
which, by Lagrange inversion, gives
$$B_{n,k_0,k_1,k_2} = \frac{2^{k_1}}{n} \binom{n}{k_0,\, k_1,\, k_2},$$
subject to the natural conditions: $k_0 + k_1 + k_2 = n$ and $k_0 = k_2 + 1$. Specializations and
moments can be easily calculated from such an approach [404]. In particular, the mean number
of nodes of each type is asymptotically:
$$\text{leaves:} \sim \frac{n}{4}, \qquad \text{1-nodes:} \sim \frac{n}{2}, \qquad \text{2-nodes:} \sim \frac{n}{4}.$$
There is an equal asymptotic proportion of leaves, double nodes, left branching, and right
branching nodes. Also, the standard deviation is in each case $O(\sqrt{n})$, so that each of the
corresponding distributions is concentrated. . . . . . END OF EXAMPLE III.13.


▷ III.16. Leaves and node-degree profile in Cayley trees. For Cayley trees, the bivariate EGF
with u marking the number of leaves is the solution to
$$T(z,u) = uz + z\left(e^{T(z,u)} - 1\right).$$
The distribution is expressed in terms of Stirling partition numbers. The mean number of leaves
in a random Cayley tree is asymptotic to $n e^{-1}$.
More generally, the mean number of nodes of outdegree k in a random Cayley tree of
size n is asymptotic to
$$n \cdot e^{-1}\,\frac{1}{k!}.$$
Degrees of nodes are thus approximately given by a Poisson law of rate 1. ◁
▷ III.17. Node-degree profile in simple varieties of trees. For a family of trees generated
by $T(z) = z\phi(T(z))$ with φ a power series, the BGF of the number of nodes of degree k
satisfies
$$T(z,u) = z\left(\phi(T(z,u)) + \phi_k\,(u-1)\,T(z,u)^k\right),$$
where $\phi_k = [u^k]\,\phi(u)$. The cumulative GF is
$$\Omega(z) = z\,\frac{\phi_k\,T(z)^k}{1 - z\,\phi'(T(z))} = \phi_k\, z^2\, T(z)^{k-1}\, T'(z),$$
from which expectations can be determined. ◁


▷ III.18. Marking in functional graphs. Consider the class $\mathcal{F}$ of finite mappings discussed in
Chapter II:
$$\mathcal{F} = \text{SET}(\mathcal{K}), \qquad \mathcal{K} = \text{CYC}(\mathcal{T}), \qquad \mathcal{T} = \mathcal{Z} \star \text{SET}(\mathcal{T}).$$
The translation into EGFs is
$$F(z) = e^{K(z)}, \qquad K(z) = \log \frac{1}{1 - T(z)}, \qquad T(z) = z\,e^{T(z)}.$$
Here are bivariate EGFs for (i) the number of components, (ii) the number of maximal trees,
(iii) the number of leaves:
$$(i)\ \ e^{uK(z)}, \qquad (ii)\ \ \frac{1}{1 - uT(z)}, \qquad (iii)\ \ \frac{1}{1 - T(z,u)} \ \ \text{with}\ \ T(z,u) = (u-1)z + z\,e^{T(z,u)}.$$
The trivariate EGF $F(z, u_1, u_2)$ of functional graphs with $u_1$ marking components and $u_2$
marking trees is
$$F(z, u_1, u_2) = \exp\left(u_1 \log(1 - u_2 T(z))^{-1}\right) = \frac{1}{(1 - u_2 T(z))^{u_1}}.$$
An explicit expression for the coefficients involves the Stirling cycle numbers. ◁


We shall stop here these examples that could be multiplied ad libitum since such 
calculations greatly simplify when interpreted in the light of asymptotic analysis. The 
phenomena observed asymptotically are, for good reasons, especially close to what 
the classical theory of branching processes provides (see the book by Harris [262]). 


Linear transformations on parameters and path length in trees. We have so far 
been dealing with a parameter defined directly by recursion. Next, we turn to other pa- 
rameters such as path length. As a preamble, one needs a simple linear transformation 
on combinatorial parameters. Let $\mathcal{A}$ be a class equipped with two scalar parameters,
χ and ξ, related by
$$\chi(\alpha) = |\alpha| + \xi(\alpha).$$
Then, the combinatorial form of BGFs yields
$$\sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{\chi(\alpha)} = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{|\alpha| + \xi(\alpha)} = \sum_{\alpha \in \mathcal{A}} (zu)^{|\alpha|} u^{\xi(\alpha)},$$
that is,
(34) $$A_\chi(z,u) = A_\xi(zu, u).$$
This is clearly a general mechanism:

Linear transformations and MGFs: A linear transformation on parameters
induces a monomial substitution on the corresponding marking variables
in MGFs.


We now put this mechanism to use in the recursive analysis of path length in trees. 


EXAMPLE III.14. Path length in trees. The path length of a tree is defined as the sum of
distances of all nodes to the root of the tree, where distances are measured by the number
of edges on the minimal connecting path of a node to the root. Path length is an important
characteristic of trees. For instance, when a tree is used as a data structure with nodes containing
additional information, path length represents the total cost of accessing all data items when a
search is started from the root. For this reason, path length surfaces, under various models, in
the analysis of algorithms, in particular of algorithms and data structures for searching and
sorting (e.g., tree-sort, quicksort, radix-sort); see [306, 434].
The definition of path length as
$$\lambda(\tau) := \sum_{\nu \in \tau} \text{dist}(\nu, \text{root}(\tau)),$$
transforms into an inductive definition:
(35) $$\lambda(\tau) = \sum_{\upsilon \text{ root subtree of } \tau} \left(\lambda(\upsilon) + |\upsilon|\right).$$
To establish this identity, distribute nodes in their corresponding subtrees; correct distances to
the subtree roots by 1, and regroup terms.

From this point on, we specialize the discussion to general Catalan trees (see Note 19 for
other cases): $\mathcal{G} = \mathcal{Z} \times \text{SEQ}(\mathcal{G})$. Introduce momentarily the parameter $\mu(\tau) = |\tau| + \lambda(\tau)$.
Then, one has from the inductive definition (35) and the general transformation rule (34):
(36) $$G_\lambda(z,u) = \frac{z}{1 - G_\mu(z,u)} \quad\text{and}\quad G_\mu(z,u) = G_\lambda(zu, u).$$
In other words, $G(z,u) \equiv G_\lambda(z,u)$ satisfies a nonlinear functional equation of the difference
type:
$$G(z,u) = \frac{z}{1 - G(uz, u)}.$$
(This functional equation will be encountered again in connection with area under Dyck paths:
see Chapter V, p. 307.) The generating function Ω(z) of cumulated values of λ is then obtained
by differentiation with respect to u upon setting u = 1. We find in this way that
$\Omega(z) := \partial_u G(z,u)\big|_{u=1}$ satisfies
$$\Omega(z) = \frac{z}{(1 - G(z))^2}\left(z\,G'(z) + \Omega(z)\right),$$
which is a linear equation that solves to
$$\Omega(z) = z^2\,\frac{G'(z)}{(1 - G(z))^2 - z} = \frac{z}{2(1 - 4z)} - \frac{z}{2\sqrt{1 - 4z}}.$$
Consequently, one has (n ≥ 1)
$$\Omega_n = 2^{2n-3} - \frac{1}{2}\binom{2n-2}{n-1},$$
where the sequence starting 1, 5, 22, 93, 386 for n ≥ 2 constitutes EIS A000346. By an
elementary asymptotic analysis, we get:
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FIGURE III.13. A random pruned binary tree of size 256 and its associated level profile: 
the histogram on the left displays the number of nodes at each level in the tree. 


The mean path length of a random Catalan tree of size $n$ is asymptotic to
$\frac{1}{2}\sqrt{\pi n^3}$; in short: a branch from the root to a random node in a random
Catalan tree of size $n$ has expected length of the order of $\sqrt{n}$.


Random Catalan trees thus tend to be somewhat imbalanced; by comparison, a fully balanced
binary tree has all paths of length at most $\log_2 n + O(1)$. END OF EXAMPLE III.14.
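The closed form for the cumulated path length can be checked numerically. The sketch below is an illustration added here, not part of the original derivation: the function names and the list-of-coefficients encoding of polynomials in $u$ are choices made for this example. It uses the functional equation in the denominator-free form $G(z,u) = z + G(z,u)\,G(zu,u)$ and compares the resulting cumulated values with $2^{2n-3} - \frac12\binom{2n-2}{n-1}$.

```python
from math import comb

def path_length_polys(nmax):
    """g[n][k] = number of general Catalan trees with n nodes and path length k.

    Uses G(z,u) = z + G(z,u) * G(zu,u), i.e. G = z / (1 - G(zu,u)) cleared of
    denominators: a tree is a lone root, or a smaller tree to which one extra
    root subtree is attached, each node of that subtree moving one level down.
    """
    g = [[], [1]]                                  # g[1]: the single-node tree
    for n in range(2, nmax + 1):
        poly = [0] * (n * (n - 1) // 2 + 1)        # a chain has path length n(n-1)/2
        for m in range(1, n):                      # m = size of the extra subtree
            rest, sub = g[n - m], g[m]
            for i, a in enumerate(rest):
                if a == 0:
                    continue
                for j, b in enumerate(sub):
                    if b:
                        poly[i + j + m] += a * b   # factor u^m from G(zu, u)
        g.append(poly)
    return g

def cumulated_path_length(poly):
    """Derivative with respect to u at u = 1."""
    return sum(k * c for k, c in enumerate(poly))

def omega_closed_form(n):
    """2^(2n-3) - (1/2) * binom(2n-2, n-1), in integer arithmetic (n >= 2)."""
    return (4 ** (n - 1) - comb(2 * n - 2, n - 1)) // 2

g = path_length_polys(8)
omegas = [cumulated_path_length(g[n]) for n in range(2, 7)]
print(omegas)   # [1, 5, 22, 93, 386]
```

Dividing `cumulated_path_length(g[n])` by `sum(g[n])` (the Catalan count) gives the exact mean path length, which can be compared with $\frac12\sqrt{\pi n^3}$ for moderate $n$.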


The imbalance in random Catalan trees is a general phenomenon—it holds for 
binary Catalan and more generally for all simple varieties of trees. Note 19 below and 
Example VII.9 (p. 442) imply that path length is invariably of order $n\sqrt{n}$ on average
in such cases. Height is of typical order $\sqrt{n}$ as shown by Rényi and Szekeres [409], de
Bruijn, Knuth and Rice [113], Kolchin [314], as well as Flajolet and Odlyzko [197]. 
Figure 13 borrowed from [434] illustrates this on a simulation. (The contour of the 
histogram of nodes by levels, once normalized, has been proved to converge to the 
process known as Brownian excursion.) 
▷ III.19. Path length in simple varieties of trees. The BGF of path length in a variety of trees
generated by $T(z) = z\phi(T(z))$ satisfies

$$T(z,u) = z\phi(T(zu,u)).$$

In particular, the cumulative GF is

$$\Omega(z) \equiv \partial_u \bigl(T(z,u)\bigr)_{u=1} = \frac{\phi'(T(z))}{\phi(T(z))}\,\bigl(zT'(z)\bigr)^2,$$

from which coefficients can be extracted. ◁


III.6. Complete generating functions and discrete models


By a complete generating function, we mean, loosely speaking, a generating function
in a (possibly large, and even infinite in the limit) number of variables that mark a
homogeneous collection of characteristics of a combinatorial class³. For instance one
may be interested in the joint distribution of all the different letters composing words,
the number of cycles of all lengths in permutations, and so on. A complete MGF
naturally entails very detailed knowledge on the enumerative properties of structures
to which it is relative. Complete generating functions, given their expressive power,
also make weighted models accessible to calculation, a situation that covers in particular
Bernoulli trials (p. 179) and branching processes from classical probability theory
(p. 185).


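Complete GFs for words. As a basic example, consider the class of all words
$\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ over some finite alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$. Let $\chi = (\chi_1, \ldots, \chi_r)$,
where $\chi_j(w)$ is the number of occurrences of the letter $a_j$ in word $w$. The MGF of $\mathcal{A}$
with respect to $\chi$ is

$$\mathcal{A} \equiv u_1 a_1 + u_2 a_2 + \cdots + u_r a_r \quad\Longrightarrow\quad A(z,\mathbf{u}) = zu_1 + zu_2 + \cdots + zu_r,$$

and $\chi$ on $\mathcal{W}$ is clearly inherited from $\chi$ on $\mathcal{A}$. Thus, by the sequence rule, one has

$$\mathcal{W} = \mathrm{SEQ}(\mathcal{A}) \quad\Longrightarrow\quad W(z,\mathbf{u}) = \frac{1}{1 - z(u_1 + u_2 + \cdots + u_r)}, \tag{37}$$

which describes all words according to their compositions into letters. In particular,
the number of words with $n_j$ occurrences of letter $a_j$ and $n = \sum n_j$ is in this framework
obtained as

$$[u_1^{n_1} u_2^{n_2} \cdots u_r^{n_r}]\,(u_1 + u_2 + \cdots + u_r)^n = \binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}.$$

We are back to the usual multinomial coefficients.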
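As a small sanity check of the coefficient extraction from (37), the following sketch (an illustration added here; the helper names are hypothetical) enumerates all words of a given length over a small alphabet, tallies them by letter composition, and compares each tally with the multinomial coefficient.

```python
from collections import Counter
from itertools import product
from math import factorial

def composition_counts(r, n):
    """Tally the r^n words of length n over {0, ..., r-1} by letter counts."""
    tally = Counter()
    for word in product(range(r), repeat=n):
        tally[tuple(word.count(a) for a in range(r))] += 1
    return tally

def multinomial(ns):
    """n! / (n_1! n_2! ... n_r!) with n = n_1 + ... + n_r."""
    out = factorial(sum(ns))
    for k in ns:
        out //= factorial(k)
    return out

tally = composition_counts(3, 5)
assert all(count == multinomial(ns) for ns, count in tally.items())
print(tally[(2, 2, 1)])   # 30 = 5! / (2! 2! 1!)
```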


▷ III.20. After Bhaskara Acharya (circa 1150 AD). Consider all the numbers formed in decimal
with digit 1 used once, with digit 2 used twice, ..., with digit 9 used nine times. Such numbers
all have 45 digits. Compute their sum $S$ and discover, much to your amazement, that $S$ equals

45875559600006153219084769286399999999999999954124440399993846780915230713600000.

This number has a long run of nines (and further nines are hidden!). Is there a simple explanation?
This exercise is inspired by the Indian mathematician Bhaskara Acharya who discovered
multinomial coefficients near 1150 AD; see [306, p. 23-24] for a brief historical note. ◁


Complete GFs for permutations and set partitions. Consider permutations and
the various lengths of their cycles. The MGF where $u_k$ marks cycles of length $k$ for
$k = 1, 2, \ldots$ can be written as an MGF in infinitely many variables:

$$P(z,\mathbf{u}) = \exp\left(u_1\frac{z}{1} + u_2\frac{z^2}{2} + u_3\frac{z^3}{3} + \cdots\right). \tag{38}$$

This MGF expression has the neat feature that, upon specializing all but a finite number
of $u_j$ to 1, we derive all the particular cases of interest with respect to any finite
collection of cycle lengths. Observe also that one can calculate in the usual way any
coefficient $[z^n]P$ as it only involves the variables $u_1, \ldots, u_n$.


³ Complete GFs are not new objects. They are simply an avatar of multivariate GFs. Thus the term is
only meant to be suggestive of a particular usage of MGFs, and essentially no new theory is needed in order
to cope with them.

▷ III.21. The theory of formal power series in infinitely many variables. (This note is for
formalists.) Mathematically, an object like $P$ in (38) is perfectly well defined. Let
$U = \{u_1, u_2, \ldots\}$ be an infinite collection of indeterminates. First, the ring of polynomials
$R = \mathbb{C}[U]$ is well defined and a given element of $R$ involves only finitely many indeterminates.
Then, from $R$, one can define the ring of formal power series in $z$, namely $R[[z]]$. (Note that,
if $f \in R[[z]]$, then each $[z^n]f$ involves only finitely many of the variables $u_j$.) The basic
operations and the notion of convergence, as described in APPENDIX A: Formal power series,
p. 676, apply in a standard way.

For instance, in the case of (38), the complete GF $P(z,\mathbf{u})$ is obtainable as the formal limit

$$P(z,\mathbf{u}) = \lim_{k\to\infty} \exp\left(u_1\frac{z}{1} + \cdots + u_k\frac{z^k}{k} + \frac{z^{k+1}}{k+1} + \cdots\right)$$

in $R[[z]]$ equipped with the formal topology. (In contrast, the quantity evocative of a generating
function of words over an infinite alphabet

$$\left(1 - z\sum_{j=1}^{\infty} u_j\right)^{-1}$$

cannot receive a sound definition as an element of the formal domain $R[[z]]$.) ◁


Henceforth, we shall keep in mind that verifications of formal correctness regard- 
ing power series in infinitely many indeterminates are always possible by returning to 
basic definitions. 

Complete generating functions are often surprisingly simple to expand. For instance,
the equivalent form of (38)

$$P(z,\mathbf{u}) = e^{u_1 z/1} \cdot e^{u_2 z^2/2} \cdot e^{u_3 z^3/3} \cdots$$

implies immediately that the number of permutations with $k_1$ cycles of size 1, $k_2$ of
size 2, and so on, is

$$\frac{n!}{k_1!\, k_2! \cdots k_n!\; 1^{k_1}\, 2^{k_2} \cdots n^{k_n}}, \tag{39}$$

provided $\sum j k_j = n$. This is a result originally due to Cauchy. Similarly, the EGF of
set partitions with $u_j$ marking the number of blocks of size $j$ is

$$S(z,\mathbf{u}) = \exp\left(u_1\frac{z}{1!} + u_2\frac{z^2}{2!} + u_3\frac{z^3}{3!} + \cdots\right).$$

A formula analogous to (39) follows: the number of partitions with $k_1$ blocks of size
1, $k_2$ of size 2, and so on, is

$$\frac{n!}{k_1!\, k_2! \cdots k_n!\; 1!^{k_1}\, 2!^{k_2} \cdots n!^{k_n}}.$$

Several examples of such complete generating functions are presented in Comtet's
book; see [98], pages 225 and 233.
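Both counting formulae lend themselves to a brute-force check on small sizes. The sketch below is an added illustration with hypothetical helper names; it enumerates all permutations of 5 elements and all set partitions of 6 elements, tallies them by cycle type or block type $(k_1, \ldots, k_n)$, and compares with Cauchy's formula (39) and its set-partition analogue.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """(k_1, ..., k_n) where k_j is the number of cycles of length j."""
    n = len(perm)
    seen, ks = [False] * n, [0] * n
    for start in range(n):
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            length += 1
        if length:
            ks[length - 1] += 1
    return tuple(ks)

def set_partitions(elements):
    """Generate every partition of a list into nonempty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        yield [[first]] + part                       # first alone in a block
        for i in range(len(part)):                   # or added to an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def block_type(part, n):
    ks = [0] * n
    for block in part:
        ks[len(block) - 1] += 1
    return tuple(ks)

def count_by_type(ks, labelled_blocks):
    """n! / prod_j (k_j! * w_j^k_j), with w_j = j (cycles) or j! (blocks)."""
    n = sum(j * k for j, k in enumerate(ks, start=1))
    denom = 1
    for j, k in enumerate(ks, start=1):
        w = factorial(j) if labelled_blocks else j
        denom *= factorial(k) * w ** k
    return factorial(n) // denom

perm_tally = Counter(cycle_type(p) for p in permutations(range(5)))
part_tally = Counter(block_type(p, 6) for p in set_partitions(list(range(6))))
assert all(c == count_by_type(ks, False) for ks, c in perm_tally.items())
assert all(c == count_by_type(ks, True) for ks, c in part_tally.items())
```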


▷ III.22. Complete GFs for compositions and surjections. The complete GFs of integer
compositions and surjections with $u_j$ marking the number of components of size $j$ are

$$\frac{1}{1 - \sum_{j \ge 1} u_j z^j}, \qquad \frac{1}{1 - \sum_{j \ge 1} u_j \frac{z^j}{j!}}.$$

The associated counts with $n = \sum_j j k_j$ are given by

$$\binom{k_1 + k_2 + \cdots}{k_1, k_2, \ldots}, \qquad \frac{n!}{1!^{k_1}\, 2!^{k_2} \cdots}\binom{k_1 + k_2 + \cdots}{k_1, k_2, \ldots}.$$

These factored forms derive directly from the multinomial expansion. The symbolic form of
the multinomial expansion of powers of a generating function is sometimes expressed in terms
of Bell polynomials, themselves nothing but a rephrasing of the multinomial expansion; see
Comtet's book [98, Sec. 3.3] for a fair treatment of such polynomials. ◁
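▷ III.23. Faà di Bruno's formula. The formulae for the successive derivatives of a functional
composition $h(z) = f(g(z))$

$$\partial_z h(z) = f'(g(z))\,g'(z), \qquad \partial_z^2 h(z) = f''(g(z))\,g'(z)^2 + f'(g(z))\,g''(z), \quad \ldots$$

are clearly equivalent to the expansion of a formal power series composition. Indeed, assume
without loss of generality that $z = 0$ and $g(0) = 0$; set $f_n := \partial_z^n f(0)$, and similarly for $g$, $h$.
Then:

$$h(z) \equiv \sum_n h_n \frac{z^n}{n!} = \sum_k \frac{f_k}{k!}\left(g_1 z + \frac{g_2}{2!} z^2 + \cdots\right)^k.$$

Thus in one direct application of the multinomial expansion, one finds

$$\frac{h_n}{n!} = \sum_k \frac{f_k}{k!} \sum_{\mathcal{C}} \binom{k}{\ell_1, \ell_2, \ldots, \ell_n}\left(\frac{g_1}{1!}\right)^{\ell_1}\left(\frac{g_2}{2!}\right)^{\ell_2}\cdots\left(\frac{g_n}{n!}\right)^{\ell_n},$$

where the summation condition $\mathcal{C}$ is: $1\ell_1 + 2\ell_2 + \cdots + n\ell_n = n$, $\ell_1 + \ell_2 + \cdots + \ell_n = k$.
This shallow identity is known as Faà di Bruno's formula [98, p. 137]. (Faà di Bruno (1825–1888)
was beatified by the Catholic Church in 1988, presumably for reasons not related to his
formula.) ◁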


▷ III.24. Relations between symmetric functions. Symmetric functions may be manipulated
by mechanisms that are often reminiscent of the set and multiset constructions. They appear
in many areas of combinatorial enumeration. Let $X = \{x_i\}_{i=1}^{r}$ be a collection of formal
variables. Define the symmetric functions

$$\prod_i (1 + x_i z) = \sum_n a_n z^n, \qquad \prod_i \frac{1}{1 - x_i z} = \sum_n b_n z^n, \qquad \sum_i \frac{x_i z}{1 - x_i z} = \sum_n c_n z^n.$$

The $a_n$, $b_n$, $c_n$, called resp. elementary, monomial, and power symmetric functions are expressible as

$$a_n = \sum_{i_1 < i_2 < \cdots < i_n} x_{i_1} x_{i_2} \cdots x_{i_n}, \qquad b_n = \sum_{i_1 \le i_2 \le \cdots \le i_n} x_{i_1} x_{i_2} \cdots x_{i_n}, \qquad c_n = \sum_{i=1}^{r} x_i^n.$$

The following relations hold for the OGFs $A(z)$, $B(z)$, $C(z)$ of $a_n$, $b_n$, $c_n$:

$$B(z) = \frac{1}{A(-z)}, \qquad A(z) = \frac{1}{B(-z)},$$

$$C(z) = z\frac{d}{dz}\log B(z), \qquad B(z) = \exp\int_0^z C(t)\,\frac{dt}{t}.$$

Consequently, each of $a_n$, $b_n$, $c_n$ is polynomially expressible in terms of any of the other quantities.
(The connection coefficients, like in Note 23, involve multinomials.) ◁


▷ III.25. Regular graphs. A graph is $r$-regular iff each node has degree exactly equal to $r$.
The number of $r$-regular graphs of size $n$ is

$$[x_1^r x_2^r \cdots x_n^r] \prod_{1 \le i < j \le n} (1 + x_i x_j).$$

[Gessel [234] has shown how to extract explicit expressions from such huge symmetric functions.] ◁

III.6.1. Word models. The enumeration of words constitutes a rich chapter of
combinatorial analysis, and complete GFs serve to generalize many results to the case
of nonuniform letter probabilities, like the coupon collector problem and the birthday
paradox considered in Chapter II. Applications are to be found in classical probability
theory and statistics [108] (the so-called Bernoulli trial models), as well as in computer
science [458] and mathematical models of biology [491].


EXAMPLE III.15. Words and records. Fix an alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$ and let
$\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ be the class of all words over $\mathcal{A}$, where $\mathcal{A}$ is naturally ordered by
$a_1 < a_2 < \cdots < a_r$. Given a word $w = w_1 \cdots w_n$, a (strict) record is an element $w_j$ that is
larger than all preceding elements: $w_j > w_i$ for all $i < j$. (Refer to Figure 13 of Chapter II for
a graphical rendering of records in the case of permutations.)

Consider first the subset of $\mathcal{W}$ comprising all words that have the letters $a_{i_1}, \ldots, a_{i_k}$ as
successive records, where $i_1 < \cdots < i_k$. The symbolic description of this set is in the form of
a product of $k$ terms

$$\bigl(a_{i_1}\,\mathrm{SEQ}(a_1 + \cdots + a_{i_1})\bigr) \cdots \bigl(a_{i_k}\,\mathrm{SEQ}(a_1 + \cdots + a_{i_k})\bigr). \tag{40}$$

Consider now MGFs of words where $z$ marks length, $v$ marks the number of records, and each
$u_j$ marks the number of occurrences of letter $a_j$. The MGF associated to the subset described
in (40) is then

$$\bigl(zvu_{i_1}(1 - z(u_1 + \cdots + u_{i_1}))^{-1}\bigr) \cdots \bigl(zvu_{i_k}(1 - z(u_1 + \cdots + u_{i_k}))^{-1}\bigr).$$

Summing over all values of $k$ and of $i_1 < \cdots < i_k$ gives

$$W(z,v,\mathbf{u}) = \prod_{s=1}^{r}\Bigl(1 + zvu_s\bigl(1 - z(u_1 + \cdots + u_s)\bigr)^{-1}\Bigr), \tag{41}$$

the rationale being that, for arbitrary quantities $y_j$, one has by distributivity:

$$\sum_{k=0}^{r}\;\sum_{1 \le i_1 < \cdots < i_k \le r} y_{i_1} y_{i_2} \cdots y_{i_k} = \prod_{s=1}^{r}(1 + y_s).$$


We shall encounter more applications of (41) below. For the time being let us simply
examine the mean number of records in a word of length $n$ over the alphabet $\mathcal{A}$, when all such
words are taken equally likely. One should set $u_j \mapsto 1$ (the composition into specific letters is
forgotten), so that $W$ assumes the simpler form

$$W(z,v) = \prod_{j=1}^{r}\left(1 + \frac{vz}{1 - jz}\right).$$

Logarithmic differentiation then gives access to the generating function of cumulated values,

$$\left.\frac{\partial}{\partial v} W(z,v)\right|_{v=1} = \frac{z}{1 - rz}\sum_{j=1}^{r}\frac{1}{1 - (j-1)z}.$$

Thus, by partial fraction expansion, the mean number of records in $\mathcal{W}_n$ (whose cardinality
is $r^n$) has the exact value

$$\mathbb{E}_{\mathcal{W}_n}(\#\text{ records}) = H_r - \sum_{j=1}^{r-1}\frac{(j/r)^n}{r - j}. \tag{42}$$

There appears the harmonic number $H_r$, like in the permutation case, but now with a negative
correction term which, for fixed $r$, vanishes exponentially fast with $n$ (this betrays the fact that
some letters from the alphabet might be missing). END OF EXAMPLE III.15.
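The exact value (42) can be compared with a direct enumeration on small alphabets. The following sketch is an added illustration (function names are hypothetical); it uses exact rational arithmetic so that the comparison is an equality rather than a floating-point approximation.

```python
from fractions import Fraction
from itertools import product

def mean_records_brute(r, n):
    """Average number of strict records over all r^n words of length n."""
    total = 0
    for word in product(range(1, r + 1), repeat=n):
        best = 0
        for letter in word:
            if letter > best:          # strictly larger than everything before
                total += 1
                best = letter
    return Fraction(total, r ** n)

def mean_records_formula(r, n):
    """H_r - sum_{j=1}^{r-1} (j/r)^n / (r - j), as in (42)."""
    harmonic = sum(Fraction(1, j) for j in range(1, r + 1))
    correction = sum(Fraction(j, r) ** n / (r - j) for j in range(1, r))
    return harmonic - correction

for r in range(1, 5):
    for n in range(1, 6):
        assert mean_records_brute(r, n) == mean_records_formula(r, n)
print(mean_records_formula(2, 2))   # 5/4
```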


EXAMPLE III.16. Weighted word models and Bernoulli trials. Let $\mathcal{A} = \{a_1, \ldots, a_r\}$ be an
alphabet of cardinality $r$, and let $\Lambda = \{\lambda_1, \ldots, \lambda_r\}$ be a system of numbers called weights,
where weight $\lambda_j$ is viewed as attached to letter $a_j$. Weights may be extended from letters to
words multiplicatively by defining the weight $\pi(w)$ of word $w$ as

$$\pi(w) := \lambda_{i_1}\lambda_{i_2}\cdots\lambda_{i_n} = \prod_{j=1}^{r}\lambda_j^{\chi_j(w)} \quad\text{if}\quad w = a_{i_1}a_{i_2}\cdots a_{i_n},$$

where $\chi_j(w)$ is the number of occurrences of letter $a_j$ in $w$. Finally, the weight of a set is by
definition the sum of the weights of its elements.

Combinatorially, weights of sets are immediately obtained once the corresponding generating
function is known. Indeed, let $\mathcal{S} \subseteq \mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ have complete GF

$$S(z,u_1,\ldots,u_r) = \sum_{w \in \mathcal{S}} z^{|w|}\,u_1^{\chi_1(w)} \cdots u_r^{\chi_r(w)},$$

where $\chi_j(w)$ is the number of occurrences of letter $a_j$ in $w$. Then one has

$$S(z,\lambda_1,\ldots,\lambda_r) = \sum_{w \in \mathcal{S}} \pi(w)\,z^{|w|},$$

so that extracting the coefficient of $z^n$ gives the total weight of $\mathcal{S}_n = \mathcal{S} \cap \mathcal{W}_n$ under the weight
system $\Lambda$. In other words, the GF of a weighted set is obtained by substitution of the numerical
values of the weights inside the associated complete MGF.

In probability theory, Bernoulli trials refer to sequences of independent draws from a fixed
distribution with finitely many possible values. One may think of the succession of flippings of
a coin or castings of a die. If any trial has $r$ possible outcomes, then the various possibilities
can be described by letters of the $r$-ary alphabet $\mathcal{A}$. If the probability of the $j$th outcome is
taken to be $\lambda_j$, then the $\Lambda$-weighted model on words becomes the usual probabilistic model
of independent trials. (In this situation, the $\lambda_j$'s are often written as $p_j$'s.) Observe that, in the
probabilistic situation, one must have $\lambda_1 + \cdots + \lambda_r = 1$ with each $\lambda_j$ satisfying $0 \le \lambda_j \le 1$.
The equiprobable case, where each outcome has probability $1/r$, can be obtained by setting
$\lambda_j = 1/r$ and it then becomes equivalent to the usual enumerative model. In terms of GFs,
the coefficient $[z^n]S(z, \lambda_1, \ldots, \lambda_r)$ then represents the probability that a random word of $\mathcal{W}_n$
belongs to $\mathcal{S}$. Multivariate generating functions and cumulative generating functions then obey
properties similar to their usual (ordinary, exponential) counterparts.

As an illustration, assume one has a biased coin with probability $p$ for heads ($H$) and
$q = 1 - p$ for tails ($T$). Consider the event: “in $n$ tosses of the coin, there never appear $\ell$ contiguous
heads”. The alphabet is $\mathcal{A} = \{H, T\}$. The construction describing the events of interest is, as
seen in Chapter I,

$$\mathcal{S} = \mathrm{SEQ}_{<\ell}\{H\}\;\mathrm{SEQ}\{T\,\mathrm{SEQ}_{<\ell}\{H\}\}.$$

Its GF with $u$ marking heads and $v$ marking tails is then

$$S(z,u,v) = \frac{1 - z^\ell u^\ell}{1 - zu}\left(1 - zv\,\frac{1 - z^\ell u^\ell}{1 - zu}\right)^{-1}.$$

Thus, the probability of the absence of $\ell$-runs amongst a sequence of $n$ random coin tosses is
obtained after the substitution $u \to p$, $v \to q$ in the MGF,

$$[z^n]\,\frac{1 - p^\ell z^\ell}{1 - z + q p^\ell z^{\ell+1}},$$

leading to an expression which is amenable to numerical or asymptotic analysis. Feller's
book [162, p. 322-326] offers for instance a classical discussion of the problem. END OF EXAMPLE III.16.
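The rational GF above translates into a linear recurrence on its coefficients, which gives a quick way to compute the run-free probability. The sketch below is an added illustration (the function names are hypothetical); it extracts $[z^n]$ by that recurrence and compares the result with a weighted enumeration of all $2^n$ toss sequences, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def prob_no_run_gf(n, ell, p):
    """[z^n] (1 - p^ell z^ell) / (1 - z + q p^ell z^(ell+1)), via the recurrence
    a_m = a_(m-1) - q p^ell a_(m-ell-1) + [m == 0] - p^ell [m == ell]."""
    q = 1 - p
    a = []
    for m in range(n + 1):
        val = Fraction(0)
        if m == 0:
            val += 1
        if m == ell:
            val -= p ** ell
        if m >= 1:
            val += a[m - 1]
        if m >= ell + 1:
            val -= q * p ** ell * a[m - ell - 1]
        a.append(val)
    return a[n]

def prob_no_run_brute(n, ell, p):
    """Total probability of the toss sequences of length n with no run of ell heads."""
    q = 1 - p
    total = Fraction(0)
    for tosses in product("HT", repeat=n):
        s = "".join(tosses)
        if "H" * ell not in s:
            heads = s.count("H")
            total += p ** heads * q ** (n - heads)
    return total

p = Fraction(1, 3)          # arbitrary bias chosen for the check
for ell in (1, 2, 3):
    for n in range(0, 9):
        assert prob_no_run_gf(n, ell, p) == prob_no_run_brute(n, ell, p)
```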


EXAMPLE III.17. Records in Bernoulli trials. To conclude the discussion of probabilistic
models on words, we come back to the analysis of records. Assume now that the alphabet
$\mathcal{A} = \{a_1, \ldots, a_r\}$ has in all generality the probability $p_j$ associated with the letter $a_j$. The
mean number of records is analysed by a process entirely parallel to the derivation of (42): one
finds by logarithmic differentiation of (41)

$$\mathbb{E}_{\mathcal{W}_n}(\#\text{ records}) = [z^n]\,\Omega(z) \quad\text{where}\quad \Omega(z) = \frac{z}{1 - z}\sum_{j=1}^{r}\frac{p_j}{1 - z(p_1 + \cdots + p_{j-1})}. \tag{43}$$

The cumulative GF $\Omega(z)$ in (43) has simple poles at the points $1, 1/P_{r-1}, 1/P_{r-2}$, and so on,
where $P_s = p_1 + \cdots + p_s$. For asymptotic purposes, only the dominant pole at $z = 1$ counts
(see Chapter IV for a systematic discussion), near which

$$\Omega(z) \underset{z \to 1}{\sim} \frac{1}{1 - z}\sum_{j=1}^{r}\frac{p_j}{p_j + p_{j+1} + \cdots + p_r}.$$

Consequently, one has an elegant asymptotic formula generalizing the case of permutations that
has a harmonic mean (10):

The mean number of records in a random word of length $n$ with nonuniform
letter probabilities $p_j$ satisfies asymptotically ($n \to +\infty$)

$$\mathbb{E}_{\mathcal{W}_n}(\#\text{ records}) \sim \sum_{j=1}^{r}\frac{p_j}{p_j + p_{j+1} + \cdots + p_r}.$$

This relation and similar ones were obtained by Burge [74]; analogous ideas may serve to analyse
the sorting algorithm Quicksort under equal keys [432] as well as the hybrid data structures
of Bentley and Sedgewick; see [38, 93]. END OF EXAMPLE III.17.
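The coefficient extraction in (43) and its limit are easy to evaluate exactly. In the sketch below (an added illustration; the function names and the sample distribution are assumptions made for the example), $[z^n]\Omega(z)$ is expanded as $\sum_j p_j \sum_{k<n} P_{j-1}^k$, checked against a weighted enumeration of words, and compared with the limiting sum.

```python
from fractions import Fraction
from itertools import product

def mean_records_exact(probs, n):
    """[z^n] of z/(1-z) * sum_j p_j / (1 - z (p_1 + ... + p_{j-1}))."""
    total, below = Fraction(0), Fraction(0)     # below = p_1 + ... + p_{j-1}
    for p in probs:
        total += p * sum(below ** k for k in range(n))
        below += p
    return total

def mean_records_limit(probs):
    """sum_j p_j / (p_j + p_{j+1} + ... + p_r)."""
    tail, out = sum(probs), Fraction(0)
    for p in probs:
        out += p / tail
        tail -= p
    return out

def mean_records_brute(probs, n):
    """Expected number of strict records, by enumerating all weighted words."""
    total = Fraction(0)
    for word in product(range(len(probs)), repeat=n):
        weight, best, records = Fraction(1), -1, 0
        for a in word:
            weight *= probs[a]
            if a > best:
                records, best = records + 1, a
        total += weight * records
    return total

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # sample distribution
assert mean_records_exact(probs, 4) == mean_records_brute(probs, 4)
print(mean_records_limit(probs))   # 13/6
```

For the uniform distribution $p_j = 1/r$ the limit reduces to the harmonic number $H_r$, in agreement with (42).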


Coupon collector problem and birthday paradox. Similar considerations apply
to weighted EGFs of words, as considered in Chapter II. For instance, the probability
of having attained a complete coupon collection at time $n$ in case a company issues
coupon $j$ with probability $p_j$, for $1 \le j \le r$, is (coupon collector problem, Chapter II)

$$\mathbb{P}(C \le n) = n!\,[z^n]\prod_{j=1}^{r}\bigl(e^{p_j z} - 1\bigr).$$

The probability that all coupons are different at time $n$ is (birthday paradox, Chapter II)

$$\mathbb{P}(B > n) = n!\,[z^n]\prod_{j=1}^{r}(1 + p_j z),$$

which corresponds to the birthday problem in the case of nonuniform mating periods.

Integral representations comparable to the ones of Chapter II are also available:

$$\mathbb{E}(C) = \int_0^\infty \Bigl(1 - \prod_{j=1}^{r}\bigl(1 - e^{-p_j t}\bigr)\Bigr)\,dt, \qquad \mathbb{E}(B) = \int_0^\infty \prod_{j=1}^{r}(1 + p_j t)\,e^{-t}\,dt.$$

See the study by Flajolet, Gardy, and Thimonier [181] for several variations on this
theme.
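Since $\int_0^\infty t^n e^{-t}\,dt = n!$, the integral for $\mathbb{E}(B)$ equals $\sum_{n \ge 0} \mathbb{P}(B > n)$, and the survival probabilities $n!\,[z^n]\prod_j (1 + p_j z)$ can be accumulated one factor at a time. The sketch below is an added numerical illustration in floating point; the function names are hypothetical, and the in-place update keeps every intermediate value between 0 and 1 so that no large factorial is ever formed.

```python
def birthday_survival(probs, nmax):
    """a[n] = n! [z^n] prod_j (1 + p_j z) = P(B > n), for n = 0..nmax."""
    a = [1.0] + [0.0] * nmax
    for p in probs:
        # Multiplying by (1 + p z) maps n! e_n to n! e_n + n p (n-1)! e_{n-1};
        # iterate from high degree down so a[n - 1] is still the old value.
        for n in range(nmax, 0, -1):
            a[n] += n * p * a[n - 1]
    return a

def expected_first_collision(probs):
    """E(B) = sum over n >= 0 of P(B > n); the product has degree len(probs)."""
    return sum(birthday_survival(probs, len(probs)))

uniform_365 = [1.0 / 365] * 365
print(round(expected_first_collision(uniform_365), 4))   # about 24.6166
```

Any nonuniform list of probabilities summing to 1 can be passed in the same way.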


▷ III.26. Birthday paradox with leap years. Assume that the 29th of February exists precisely
once every fourth year. Estimate the effect on the expectation of the first birthday collision. ◁


EXAMPLE III.18. Rises in Bernoulli trials: Simon Newcomb’s problem. Simon Newcomb 
(1835-1909), otherwise famous for his astronomical work, was reportedly fond of playing the 
following patience game: one draws from a deck of 52 playing cards, stacking them in piles in 
such a way that one new pile is started each time a card appears whose number is smaller than 
its predecessor. What is the probability of obtaining t piles? A solution to this famous problem 
is found in MacMahon’s book [350] and a concise account by Andrews appears in [10, §4.4]. 

Simon Newcomb's problem can be rephrased in terms of rises. Given a word $w = w_1 \cdots w_n$
over the alphabet $\mathcal{A}$ ordered by $a_1 < a_2 < \cdots$, a weak rise is a position $j < n$
such that $w_j \le w_{j+1}$. (The number of piles in Newcomb's problem is the number of cards
minus 1 minus the number of rises.) Let $W(z,v,\mathbf{u})$ be the MGF of all words where $z$ marks
length, $v$ marks the number of weak rises, and $u_j$ marks the number of occurrences of letter $j$.
Set $z_j = zu_j$ and let $W_j(z,v,\mathbf{u})$ be the MGF relative to those nonempty words that start with
letter $a_j$, so that

$$W = 1 + (W_1 + \cdots + W_r).$$

The $W_j$ satisfy the set of equations ($j = 1, \ldots, r$),

$$W_j = z_j + z_j(W_1 + \cdots + W_{j-1}) + vz_j(W_j + \cdots + W_r), \tag{44}$$

as seen by considering the first letter of each word. The linear system (44) is easily solved upon
setting $W_j = z_j X_j$. Indeed, by differencing, one finds that

$$X_{j+1} - X_j = z_j X_j (1 - v), \qquad X_{j+1} = X_j\bigl(1 + z_j(1 - v)\bigr). \tag{45}$$

In this way, each $X_j$ can be determined in terms of $X_1$. Then transporting the resulting expressions
into the relation (44) instantiated at $j = 1$, and solving for $X_1$ leads to an expression for
$X_1$, hence for all the $X_j$ and finally for $W$ itself:

$$W = \frac{v - 1}{v - P^{-1}}, \qquad P := \prod_{j=1}^{r}\bigl(1 + (1 - v)z_j\bigr). \tag{46}$$

Goulden and Jackson provide similar-looking expressions in [244] (pp. 72 and 236).

The result of (46) gives access to moments (e.g., mean and variance) of the number of
rises in a Bernoulli sequence as well as to counting results, once coefficients of the MGF are
extracted. (See also [234, 244] for some of the possible tools from the theory of symmetric
functions.) The OGF (46) can alternatively be derived by an inclusion-exclusion argument:
refer to the particular case of rises in permutations and Eulerian numbers which is discussed
below. END OF EXAMPLE III.18.

▷ III.27. The final solution to Simon Newcomb's problem. Consider a deck of cards with $a$
suits and $r$ distinct card values. Set $N = ra$. (The original problem has $r = 13$, $a = 4$,
$N = 52$.) One has from (46): $W = (1 - v)P/(1 - vP)$. The expansion of $(1 - y)^{-1}$ and the
collection of coefficients yields

$$[z_1^a \cdots z_r^a]\,W = (1 - v)\sum_{k \ge 1} v^{k-1}\,[z_1^a \cdots z_r^a]\,P^k = (1 - v)^{N+1}\sum_{k \ge 1}\binom{k}{a}^{r} v^{k-1},$$

so that

$$[z_1^a \cdots z_r^a\, v^t]\,W = \sum_{k=1}^{t+1}(-1)^{t+1-k}\binom{N+1}{t+1-k}\binom{k}{a}^{r}.$$

◁
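The alternating sum of Note 27 can be tested against direct enumeration on very small decks. The sketch below is an added illustration with hypothetical function names; it lists the distinct arrangements of a deck with $r$ values and $a$ suits (suits being indistinguishable for the purpose of rises), counts weak rises, and compares the tally with the formula.

```python
from collections import Counter
from itertools import permutations
from math import comb

def weak_rise_tally(r, a):
    """Number of arrangements of the multiset {1^a, ..., r^a} by weak-rise count."""
    deck = [value for value in range(1, r + 1) for _ in range(a)]
    tally = Counter()
    for word in set(permutations(deck)):        # keep distinct arrangements only
        rises = sum(1 for x, y in zip(word, word[1:]) if x <= y)
        tally[rises] += 1
    return tally

def newcomb_count(r, a, t):
    """sum_{k=1}^{t+1} (-1)^(t+1-k) binom(N+1, t+1-k) binom(k, a)^r with N = r a."""
    N = r * a
    return sum((-1) ** (t + 1 - k) * comb(N + 1, t + 1 - k) * comb(k, a) ** r
               for k in range(1, t + 2))

for r, a in [(2, 2), (3, 2), (2, 3)]:
    tally = weak_rise_tally(r, a)
    assert all(tally[t] == newcomb_count(r, a, t) for t in range(r * a))
print([newcomb_count(2, 2, t) for t in range(4)])   # [0, 1, 4, 1]
```

Enumeration is only feasible for tiny decks; for the original problem ($r = 13$, $a = 4$) only the closed-form sum is practical.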


III.6.2. Tree models. We examine here two important GFs associated with tree
models; these provide valuable information concerning the degree profile and the level
profile of trees, while being tightly coupled with an important class of stochastic processes,
namely the branching processes.

The major classes of trees that we have encountered so far are the unlabelled
plane trees and the labelled nonplane trees, prototypes being the general Catalan trees
(Chapter I) and the Cayley trees (Chapter II). In both cases, the counting generating
functions satisfy a relation of the form

$$Y(z) = z\phi(Y(z)), \tag{47}$$

where the GF is either ordinary (plane unlabelled trees) or exponential (nonplane labelled
trees). Corresponding respectively to the two cases, the function $\phi$ is determined by

$$\phi(w) = \sum_{\omega \in \Omega} w^{\omega}, \qquad \phi(w) = \sum_{\omega \in \Omega} \frac{w^{\omega}}{\omega!}, \tag{48}$$

where $\Omega \subseteq \mathbb{N}$ is the set of allowed node degrees. Meir and Moon in an important paper [356]
have described some common properties of tree families that are determined
by the Axiom (47). (For instance mean path length is invariably of order $n\sqrt{n}$, see
Chapter VII, and height is $O(\sqrt{n})$.) Following these authors, we call simple variety of
trees any class whose counting GF is defined by an equation of type (47). For each
of the two cases of (48), we write

$$\phi(w) = \sum_{j=0}^{\infty} \phi_j w^j. \tag{49}$$

Degree profile of trees. First we examine the degree profile of trees. Such a
profile is determined by the collection of parameters $\chi_j$, where $\chi_j(\tau)$ is the number
of nodes of outdegree $j$ in $\tau$. The variable $u_j$ will be used to mark $\chi_j$, that is, nodes of
outdegree $j$. The discussion already conducted regarding recursive parameters shows
that the GF $Y(z,\mathbf{u})$ satisfies the equation

$$Y(z,\mathbf{u}) = z\Phi(Y(z,\mathbf{u})) \quad\text{where}\quad \Phi(w) = u_0\phi_0 + u_1\phi_1 w + u_2\phi_2 w^2 + \cdots.$$

Formal Lagrange inversion can then be applied to $Y(z,\mathbf{u})$, to the effect that its coefficients
are given by the coefficients of the powers of $\Phi$.

Proposition III.7 (Degree profile of trees). The number of trees of size $n$ and degree
profile $(n_0, n_1, n_2, \ldots)$ in a simple variety of trees defined by the “generator” (49) is

$$Y_{n;n_0,n_1,n_2,\ldots} = \omega_n \cdot \frac{1}{n}\binom{n}{n_0, n_1, n_2, \ldots}\phi_0^{n_0}\phi_1^{n_1}\phi_2^{n_2}\cdots. \tag{50}$$

There, $\omega_n = 1$ in the unlabelled case, whereas $\omega_n = n!$ in the labelled case. The
values of the $n_j$ are assumed to satisfy the two consistency conditions: $\sum_j n_j = n$
and $\sum_j j n_j = n - 1$.


PROOF. The consistency conditions translate the fact that the total number of nodes
should be $n$ while the total number of edges should equal $n - 1$ (each node of degree $j$
is the originator of $j$ edges). The result follows from Lagrange inversion

$$Y_{n;n_0,n_1,n_2,\ldots} = \omega_n \cdot [u_0^{n_0}u_1^{n_1}u_2^{n_2}\cdots]\left(\frac{1}{n}[w^{n-1}]\,\Phi(w)^n\right),$$

to which a standard multinomial expansion applies, yielding (50).
For instance, for general Catalan trees ($\phi_j = 1$) and for Cayley trees ($\phi_j = 1/j!$)
these formulae become

$$\frac{1}{n}\binom{n}{n_0, n_1, n_2, \ldots} \quad\text{and}\quad \frac{(n-1)!}{0!^{n_0}\,1!^{n_1}\,2!^{n_2}\cdots}\binom{n}{n_0, n_1, n_2, \ldots}.$$


The proof above also reveals the logical equivalence between the general tree 
counting result of Proposition III.7 and the most general case of Lagrange inversion. 
(This results from the fact that ® can be specialized to any particular series.) Put 
otherwise, any direct proof of (50) provides a combinatorial proof of the Lagrange 
inversion theorem. Such direct derivations have been proposed by Raney [407] and 
are based on simple but cunning surgery performed on lattice path representations of 
trees (the “conjugation principle” of which a particular case is the “cycle lemma” of 
Dvoretzky—Motzkin [145]). 
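For general Catalan trees, the degree-profile count $\frac{1}{n}\binom{n}{n_0, n_1, \ldots}$ can be confirmed by exhaustive enumeration at small sizes. In the sketch below (an added illustration; the nested-tuple encoding of plane trees and the function names are assumptions of this example), a tree is the tuple of its root subtrees.

```python
from collections import Counter
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def plane_trees(n):
    """All general plane trees with n nodes; a tree is the tuple of its subtrees."""
    if n == 1:
        return ((),)
    # Split off the first root subtree; what remains is again a tree.
    return tuple((sub,) + rest
                 for first in range(1, n)
                 for sub in plane_trees(first)
                 for rest in plane_trees(n - first))

def degree_profile(tree, n):
    """(n_0, ..., n_{n-1}) with n_j = number of nodes of outdegree j."""
    profile, stack = [0] * n, [tree]
    while stack:
        node = stack.pop()
        profile[len(node)] += 1
        stack.extend(node)
    return tuple(profile)

def catalan_profile_count(profile):
    """(1/n) * multinomial(n; n_0, n_1, ...), formula (50) with phi_j = 1."""
    n = sum(profile)
    count = factorial(n)
    for k in profile:
        count //= factorial(k)
    return count // n

n = 6
tally = Counter(degree_profile(t, n) for t in plane_trees(n))
assert all(c == catalan_profile_count(prof) for prof, c in tally.items())
print(len(plane_trees(n)), len(tally))
```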


Level profile of trees. The next example demonstrates the utility of complete 
generating functions for investigating the level profile of trees. 


EXAMPLE III.19. Trees and level profile. Given a rooted tree $\tau$, its level profile is defined as
the vector $(n_0, n_1, n_2, \ldots)$ where $n_j$ is the number of nodes present at level $j$ (i.e., at distance $j$
from the root) in tree $\tau$. Continuing within the framework of a simple variety of trees, we now
define the quantity $Y_{n;n_0,n_1,n_2,\ldots}$ to be the number of trees with size $n$ and level profile given by
the $n_j$. The corresponding complete GF $Y(z,\mathbf{u})$ with $z$ marking size and $u_j$ marking nodes at
level $j$ is expressible in terms of the fundamental “generator” $\phi$:

$$Y(z,\mathbf{u}) = zu_0\,\phi\bigl(zu_1\,\phi\bigl(zu_2\,\phi\bigl(zu_3\,\phi(\cdots)\bigr)\bigr)\bigr). \tag{51}$$

We may call this a “continued $\phi$-form”. For instance general Catalan trees have generator
$\phi(w) = (1 - w)^{-1}$, so that in this case the complete GF is the continued fraction:

$$Y(z,\mathbf{u}) = \cfrac{u_0 z}{1 - \cfrac{u_1 z}{1 - \cfrac{u_2 z}{1 - \cfrac{u_3 z}{1 - \ddots}}}}. \tag{52}$$

(See Section V.3 for complementary aspects.) In contrast, Cayley trees are generated by
$\phi(w) = e^{w}$, so that

$$Y(z,\mathbf{u}) = zu_0\,e^{zu_1 e^{zu_2 e^{zu_3 e^{\cdot^{\cdot^{\cdot}}}}}},$$

which is a “continued exponential”, that is, a tower of exponentials. Expanding such generating
functions with respect to $u_0, u_1, \ldots$, in order gives straightforwardly:


Proposition III.8 (Level profile of trees). The number of trees of size $n$ and level profile
$(n_0, n_1, n_2, \ldots)$ in a simple variety of trees defined by the “generator” $\phi(w)$ of (49) is

$$Y_{n;n_0,n_1,n_2,\ldots} = \omega_{n-1} \cdot \phi^{(n_0)}_{n_1}\,\phi^{(n_1)}_{n_2}\,\phi^{(n_2)}_{n_3} \cdots \quad\text{where}\quad \phi^{(\mu)}_{\nu} := [w^{\nu}]\,\phi(w)^{\mu}.$$

There, the consistency conditions are $n_0 = 1$ and $\sum_j n_j = n$. In particular, the counts for
general Catalan trees and for Cayley trees are respectively

$$\binom{n_0 + n_1 - 1}{n_1}\binom{n_1 + n_2 - 1}{n_2}\binom{n_2 + n_3 - 1}{n_3}\cdots; \qquad \frac{(n-1)!}{n_0!\,n_1!\,n_2!\cdots}\,n_0^{n_1}\,n_1^{n_2}\,n_2^{n_3}\cdots.$$

(Note that one must always have $n_0 = 1$ for a single tree; the general formula with $n_0 \ne 1$ and
$\omega_{n-1}$ replaced by $\omega_{n-n_0}$ gives the level profile of forests.) The first of these enumerative results
is due to Flajolet [168] and it places itself within a general combinatorial theory of continued
fractions (Chapter V); the second one is due to Rényi and Szekeres [409] who developed such
a formula in the course of a deep study relative to the distribution of height in random Cayley
trees. END OF EXAMPLE III.19.
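The product-of-binomials formula for general Catalan trees can be verified in the same exhaustive manner as the degree profile. The sketch below is an added illustration (the nested-tuple tree encoding and the function names are assumptions of this example); only the unlabelled Catalan case is checked.

```python
from collections import Counter
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def plane_trees(n):
    """All general plane trees with n nodes; a tree is the tuple of its subtrees."""
    if n == 1:
        return ((),)
    return tuple((sub,) + rest
                 for first in range(1, n)
                 for sub in plane_trees(first)
                 for rest in plane_trees(n - first))

def level_profile(tree):
    """(n_0, n_1, ...) with n_j = number of nodes at distance j from the root."""
    profile, level = [], [tree]
    while level:
        profile.append(len(level))
        level = [child for node in level for child in node]
    return tuple(profile)

def catalan_level_count(profile):
    """Product of binom(n_j + n_{j+1} - 1, n_{j+1}) over consecutive levels."""
    out = 1
    for a, b in zip(profile, profile[1:]):
        out *= comb(a + b - 1, b)
    return out

tally = Counter(level_profile(t) for t in plane_trees(7))
assert all(c == catalan_level_count(prof) for prof, c in tally.items())
print(tally[(1, 2, 3, 1)], catalan_level_count((1, 2, 3, 1)))
```

Each binomial counts the ways of distributing the $n_{j+1}$ nodes of one level, in order, among the $n_j$ nodes of the level above.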


▷ III.28. Continued forms for path length. The BGFs of path length are obtained from the level
profile MGF by means of the substitution $u_j \mapsto q^j$. For general Catalan trees and Cayley trees,
this gives

$$G(z,q) = \cfrac{z}{1 - \cfrac{zq}{1 - \cfrac{zq^2}{1 - \ddots}}}, \qquad T(z,q) = z\,e^{zq\,e^{zq^2 e^{\cdot^{\cdot^{\cdot}}}}}, \tag{53}$$

where $q$ marks path length. The MGFs are ordinary and exponential respectively. (Combined
with differentiation, such MGFs represent an attractive option for mean value analysis.) ◁

Trees and processes. The next example is an especially important application of
complete GFs, as these GFs provide a bridge between combinatorial models and a
major class of stochastic processes, the branching processes of probability theory.


EXAMPLE III.20. Weighted tree models and branching processes. Consider the family $\mathcal{G}$ of
all general plane trees. Let $\Lambda = (\lambda_0, \lambda_1, \ldots)$ be a system of numeric weights. The weight of
a node of outdegree $j$ is taken to be $\lambda_j$ and the weight of a tree is the product of the individual
weights of its nodes:

$$\pi(\tau) = \prod_{j=0}^{\infty}\lambda_j^{\chi_j(\tau)}, \tag{54}$$

with $\chi_j(\tau)$ the number of nodes of degree $j$ in $\tau$. One can view the weighted model of trees as
a model in which a tree receives a probability proportional to $\pi(\tau)$. Precisely, the probability
of selecting a particular tree $\tau$ under this model is, for a fixed size $n$,

$$\mathbb{P}_{\mathcal{G}_n,\Lambda}(\tau) = \frac{\pi(\tau)}{\sum_{|T| = n}\pi(T)}. \tag{55}$$

This defines a probability measure over the set $\mathcal{G}_n$ and one can consider events and random
variables under this weighted model.

The weighted model defined by (54) and (55) covers any simple variety of trees: just
replace each $\lambda_j$ by the quantity $\phi_j$ given by the “generator” (49) of the model. For instance,
plane unlabelled unary-binary trees are obtained by $\Lambda = (1, 1, 1, 0, 0, \ldots)$, while Cayley trees
correspond to $\lambda_j = 1/j!$. Two equivalence-preserving transformations are then especially
important in this context:

(i) Let $\Lambda^*$ be defined by $\lambda_j^* = c\lambda_j$ for some nonzero constant $c$. Then the weight
corresponding to $\Lambda^*$ satisfies $\pi^*(\tau) = c^{|\tau|}\pi(\tau)$. Consequently, the models associated
to $\Lambda$ and $\Lambda^*$ are equivalent as regards (55).

(ii) Let $\Lambda^\circ$ be defined by $\lambda_j^\circ = \theta^j\lambda_j$ for some nonzero constant $\theta$. Then the weight
corresponding to $\Lambda^\circ$ satisfies $\pi^\circ(\tau) = \theta^{|\tau|-1}\pi(\tau)$, since $\sum_j j\chi_j(\tau) = |\tau| - 1$ for
any tree $\tau$. Thus the models $\Lambda^\circ$ and $\Lambda$ are again equivalent.

Each transformation has a simple effect on the generator $\phi$, namely:

$$\phi(w) \mapsto \phi^*(w) = c\phi(w) \quad\text{and}\quad \phi(w) \mapsto \phi^\circ(w) = \phi(\theta w). \tag{56}$$
Once equipped with such equivalence transformations, it becomes possible to describe
probabilistically the process that generates trees according to a weighted model. Assume that
$\lambda_j \ge 0$ and that the $\lambda_j$ are summable. Then the normalized quantities

$$p_j = \frac{\lambda_j}{\sum_j \lambda_j}$$

form a probability distribution over $\mathbb{N}$. By the first equivalence-preserving transformation the
model induced by the weights $p_j$ is the same as the original model induced by the $\lambda_j$. (By
the second equivalence transformation, one can furthermore assume that the generator $\phi$ is the
probability generating function of the $p_j$.)

Such a model defined by nonnegative weights $\{p_j\}$ summing to 1 is nothing but the classical
model of branching processes (also known as Galton-Watson processes); see [17]. In effect,
a realization $T$ of the branching process is classically defined by the two rules: (i) produce a
root node of degree $j$ with probability $p_j$; (ii) if $j \ge 1$, attach to the root node a collection
$T_1, \ldots, T_j$ of independent realizations of the process. This may be viewed as the development
of a “family” stemming from a common ancestor where any individual has probability $p_j$ of
giving birth to $j$ children. Clearly, the probability of obtaining a particular finite tree $\tau$ is
$\pi(\tau)$, where $\pi$ is given by (54) and the weights are $\lambda_j = p_j$. The generator

$$\phi(w) = \sum_{j=0}^{\infty} p_j w^j$$

is then nothing but the probability generating function of (one-generation) offspring, with the
quantity $\mu = \phi'(1)$ being its mean size.

For the record, we recall that branching processes can be classified into three categories
depending on the values of $\mu$:


Subcriticality: when $\mu < 1$, the random tree produced is finite with probability 1
and its expected size is also finite.

Criticality: when $\mu = 1$, the random tree produced is finite with probability 1 but its
expected size is infinite.

Supercriticality: when $\mu > 1$, the random tree produced is finite with probability
strictly less than 1.


From the discussion of equivalence transformations (56), there furthermore results that, regarding
trees of a fixed size $n$, there is complete equivalence between all branching processes with
generators of the form

$$\phi_\theta(w) = \frac{\phi(\theta w)}{\phi(\theta)}.$$

Such families of related functions are known as “exponential families” in probability theory. In
this way, one may always regard at will the random tree produced by a weighted model of some
fixed size $n$ as originating from a branching process of subcritical, critical, or supercritical type
conditioned upon the size of the total progeny.
Finally, take a set S C G for which the complete generating function of S with respect to 
the degree profile is available, 


S(z,uo,ui,--.) = De Pa (ee? A ) 
TES 
Then, for a system of weights A, one has 
S(z, Xo, Ai, Se .) = ys; n(r)zl7, 
TES 


Thus, the probability that a weighted tree of size n belongs to S becomes accessible by extract- 
ing the coefficient of z”. This applies a fortiori to branching processes as well. In summary, 
the analysis of parameters of trees of size n under either weighted models or branching pro- 
cess models derives from substituting weights or probability values inside the corresponding 
combinatorial generating functions. «0.0.0.0... 00 c ccc END OF EXAMPLE III.20. 


The reduction of combinatorial tree models to branching processes has been pur- 
sued early, most notably by the “Russian School”: see especially the books by Kolchin 
[314, 315] and references therein. (For asymptotic purposes, the equivalence between 
combinatorial models and critical branching processes often turns out to be most fruit- 
ful.) Conversely, symbolic-combinatorial methods may be viewed as a systematic way 
of obtaining equations relative to characteristics of branching processes. We do not 
elaborate further along these lines as this would take us outside of the scope of the 
present book.

▷ III.29. Catalan trees, Cayley trees, and branching processes. Catalan trees of size $n$ are
defined by the weighted model in which $\lambda_j \equiv 1$, but also equivalently by $\hat\lambda_j = c\,\theta^j$, for
any $c > 0$ and $\theta \le 1$. In particular they coincide with the random tree produced by the critical
branching process whose offspring probabilities are geometric: $p_j = 1/2^{j+1}$.

Cayley trees are a priori defined by $\lambda_j = 1/j!$. They can be generated by the critical
branching process with Poisson probabilities, $p_j = e^{-1}/j!$, and more generally with an arbitrary
Poisson distribution $p_j = e^{-\lambda}\lambda^j/j!$. ◁


III.7. Additional constructions


We discuss here additional constructions already examined in earlier chapters,
namely pointing and substitution (Section III.7.1) as well as order constraints (Section III.7.2)
on the one hand, implicit structures (Section III.7.3) on the other hand.
Given that basic translation mechanisms can be directly adapted to the multivariate
realm, such extensions involve basically no new concept, and the methods of Chapters I
and II can be recycled. In Section III.7.4, we revisit the classical principle of
inclusion-exclusion under a generating function perspective. In this light, the principle
appears as a typically multivariate device well-suited to enumerating objects according
to the number of occurrences of sub-configurations.


III.7.1. Pointing and substitution. Let $(\mathcal{F}, \chi)$ be a class-parameter pair, where
$\chi$ is multivariate of dimension $r \ge 1$, and let $F(\mathbf{z})$ be the MGF associated to it in
the notations of (18) and (27). In particular $z_0 \equiv z$ marks size, and $z_k$ marks the
component $k$ of the multiparameter $\chi$. If $z$ marks size, then, as in the univariate case,
$\Theta_z$ translates the fact of distinguishing one atom. Generally, pick a variable $x \equiv z_j$
for some $j$ with $0 \le j \le r$. Then since

\[ x\,\partial_x \left(s^a t^b x^f\right) = f \cdot \left(s^a t^b x^f\right), \]

the interpretation of the operator $\Theta_x \equiv x\,\partial_x$ is immediate; it means “pick up in all
possible ways in objects of $\mathcal{F}$ a configuration marked by $x$ and point to it”. For
instance, if $F(z,u)$ is the BGF of trees where $z$ marks size and $u$ marks leaves,
then $\Theta_u F(z,u) = u\,\partial_u F(z,u)$ enumerates trees with one distinguished leaf.
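
The effect of the pointing operator on coefficients can be sketched in a few lines. The example class below (binary words over {a, b}, with $u$ marking occurrences of a) is an illustrative assumption, not an example from the text, and the dictionary representation of a BGF is likewise only a convenience.

```python
from math import comb

def theta_u(bgf):
    """Apply Theta_u = u d/du to a BGF stored as {(n, k): [z^n u^k] F}.

    Each coefficient is multiplied by the exponent k of u, which counts
    the ways of pointing at one of the k configurations marked by u.
    """
    return {(n, k): k * c for (n, k), c in bgf.items() if k > 0}

# Hypothetical example class: binary words of length n with k letters 'a'.
N = 6
words = {(n, k): comb(n, k) for n in range(N + 1) for k in range(n + 1)}
pointed = theta_u(words)
```

Here `pointed[(n, k)]` counts words with one distinguished letter a. The same objects are obtained by choosing the pointed position among $n$ and filling the rest freely, which gives $n\binom{n-1}{k-1}$.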

Similarly, the substitution $x \mapsto S(\mathbf{z})$ in a GF $F$, where $S(\mathbf{z})$ is the MGF of a
class $\mathcal{S}$, means attaching an object of type $\mathcal{S}$ to configurations marked by the variable $x$
in $\mathcal{F}$. We refrain from giving detailed definitions (that would be somewhat clumsy
and uninformative) as the process is better understood by practice than by long formal
developments. Justification in each particular case is easily obtained by returning to
the combinatorial representation of generating functions as images of combinatorial
classes.


EXAMPLE III.21. Constrained integer compositions and “slicing”. This example illustrates
variations around the substitution scheme. Consider compositions of integers where successive
summands have sizes that are constrained to belong to a fixed set $\mathcal{R} \subset \mathbb{N}^2$. For instance, the
relations

\[ \mathcal{R}_1 = \{(x,y) \mid 1 \le x \le y\}, \qquad \mathcal{R}_2 = \{(x,y) \mid 1 \le y \le 2x\}, \]

correspond to weakly increasing summands in the case of $\mathcal{R}_1$ and to summands that can at most
double at each stage in the case of $\mathcal{R}_2$. In the “ragged landscape” representation of compositions,
this means considering diagrams of unit cells aligned in columns along the horizontal
axis, with successive columns obeying the constraint imposed by $\mathcal{R}$.

Let $F(z,u)$ be the BGF of such $\mathcal{R}$-restricted compositions, where $z$ marks total sum and $u$
marks the value of the last summand, that is, the height of the last column. The function $F(z,u)$
satisfies a functional equation of the form


(57)    \[ F(z,u) = f(zu) + \bigl(\mathcal{L}[F(z,u)]\bigr)_{u \mapsto zu}, \]

where $f(z)$ is the generating function of the one-column objects and $\mathcal{L}$ is a linear operator over
formal series in $u$ given by

(58)    \[ \mathcal{L}[u^j] := \sum_{(j,k) \in \mathcal{R}} u^k. \]

In effect, Equation (57) describes inductively objects as comprising either one column ($f(zu)$)
or else being formed by adding a new column to an existing one. In the latter case, the last
column added has a size $k$ that must be such that $(j,k) \in \mathcal{R}$, if it was added after a column of
size $j$, and it will contribute $u^k z^k$ to the BGF $F(z,u)$; this is precisely what (58) expresses
once the substitution $u \mapsto zu$ of (57) is performed. In
particular, $F(z,1)$ gives back the enumeration of $\mathcal{F}$-objects irrespective of the size of the last
column.

For a rule $\mathcal{R}$ that is “simple enough”, the basic equation (57) will often involve a substitution.
Let us first rederive in this way the enumeration of partitions. We take $\mathcal{R} = \mathcal{R}_1$ and
assume that the first column can have any positive size. Compositions into increasing summands
are clearly the same as partitions. Since

\[ \mathcal{L}[u^j] = u^j + u^{j+1} + u^{j+2} + \cdots = \frac{u^j}{1-u}, \]

the function $F(z,u)$ satisfies a functional equation involving a substitution,

(59)    \[ F(z,u) = \frac{zu}{1-zu} + \frac{1}{1-zu}\,F(z,zu). \]

This relation iterates: any linear functional equation of the substitution type

\[ \phi(u) = \alpha(u) + \beta(u)\,\phi(\sigma(u)) \]

is solved formally by

(60)    \[ \phi(u) = \alpha(u) + \beta(u)\,\alpha(\sigma(u)) + \beta(u)\,\beta(\sigma(u))\,\alpha(\sigma^{(2)}(u)) + \cdots, \]

where $\sigma^{(j)}(u)$ designates the $j$th iterate of $\sigma$ applied to $u$.


FIGURE III.14. The technique of “adding a slice” for enumerating constrained compositions.

Returning to compositions into increasing summands, that is, partitions, the turnkey solution
(60) gives, upon iterating on the second argument with the first argument treated as a
parameter:

(61)    \[ F(z,u) = \frac{zu}{1-zu} + \frac{z^2u}{(1-zu)(1-z^2u)} + \frac{z^3u}{(1-zu)(1-z^2u)(1-z^3u)} + \cdots. \]

Equivalence with the alternative form

(62)    \[ F(z,u) = \frac{zu}{1-z} + \frac{z^2u^2}{(1-z)(1-z^2)} + \frac{z^3u^3}{(1-z)(1-z^2)(1-z^3)} + \cdots \]

is then easily verified from (59) upon expanding $F(z,u)$ as a series in $u$ and applying the
method of indeterminate coefficients to the form $(1-zu)F(z,u) = zu + F(z,zu)$. The presentation
(62) is furthermore consistent with the treatment of partitions given in Chapter I since
the quantity $[u^k]F(z,u)$ clearly represents the OGF of (nonempty) partitions whose largest
summand is $k$. (In passing, the equality between (61) and (62) is a shallow but curious identity
that is quite typical of the area of $q$-analogues.)
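
The identity between (61) and (62) can be checked numerically. The sketch below assumes the standard reading of the two series (the $j$th term of (61) is $z^j u / \prod_{i \le j}(1 - z^i u)$ and the $k$th term of (62) is $z^k u^k / \prod_{i \le k}(1 - z^i)$); the function names and the truncation at 200 terms are illustrative choices.

```python
def series_61(z, u, terms=200):
    """Truncated sum of (61): iteration of the substitution u -> zu."""
    total, denom = 0.0, 1.0
    for j in range(1, terms + 1):
        denom *= 1.0 - z**j * u
        total += z**j * u / denom
    return total

def series_62(z, u, terms=200):
    """Truncated sum of (62): partitions classified by largest summand."""
    total, denom = 0.0, 1.0
    for k in range(1, terms + 1):
        denom *= 1.0 - z**k
        total += (z * u) ** k / denom
    return total

def residual_59(z, u):
    """Defect of the functional equation (1 - zu) F(z,u) = zu + F(z, zu)."""
    return (1 - z * u) * series_62(z, u) - z * u - series_62(z, z * u)
```

Evaluating both series at a point such as $z = 0.3$, $u = 0.7$ gives values that agree to machine precision, and `residual_59` is numerically zero there.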

This same method has been applied in [201] to compositions satisfying condition $\mathcal{R}_2$
above. In this case, successive summands are allowed to double at most at each stage. The
associated linear operator is

\[ \mathcal{L}[u^j] = u + \cdots + u^{2j} = u\,\frac{1-u^{2j}}{1-u}. \]

For simplicity, it is assumed that the first column has size 1. Thus, $F$ satisfies a functional
equation of the substitution type:

\[ F(z,u) = zu + \frac{zu}{1-zu}\,\bigl(F(z,1) - F(z, z^2u^2)\bigr). \]

This can be solved by means of the general iteration mechanism (60), treating momentarily
$F(z,1)$ as a known quantity: with $\alpha(u) := zu + \frac{zu}{1-zu}\,F(z,1)$, one has

\[ F(z,u) = \alpha(u) - \frac{zu}{1-zu}\,\alpha(z^2u^2) + \frac{zu}{1-zu}\,\frac{z^3u^2}{1-z^3u^2}\,\alpha(z^6u^4) - \cdots. \]


Then, the substitution $u = 1$ in the solution becomes permissible. Upon solving for $F(z,1)$,
one eventually gets the somewhat curious GF for compositions satisfying $\mathcal{R}_2$:

\[ F(z,1) = \frac{\displaystyle\sum_{j \ge 1} (-1)^{j-1}\, z^{2^{j+1}-j-2}/Q_{j-1}(z)}{\displaystyle\sum_{j \ge 0} (-1)^{j}\, z^{2^{j+1}-j-2}/Q_{j}(z)}, \qquad \text{where } Q_j(z) = (1-z)(1-z^3)(1-z^7)\cdots(1-z^{2^j-1}). \]

The sequence of coefficients starts as 1, 1, 2, 3, 5, 9, 16, 28, 50 and is EIS A002572: it represents
for instance the number of possible level profiles of binary trees, or equivalently the number
of partitions of 1 into summands of the form $1, \frac12, \frac14, \frac18, \ldots$ (this is related to the number
of solutions to Kraft’s inequality). See [201] for details including very precise asymptotic estimates
and Tangora’s paper [464] for relations to algebraic topology.
END OF EXAMPLE III.21.
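
As a sanity check on this generating function, one can compare its coefficients with a direct count. The sketch below assumes the reading $F(z,1) = N(z)/D(z)$ with $N(z) = \sum_{j \ge 1} (-1)^{j-1} z^{2^{j+1}-j-2}/Q_{j-1}(z)$ and $D(z) = \sum_{j \ge 0} (-1)^{j} z^{2^{j+1}-j-2}/Q_j(z)$; all function names are illustrative.

```python
from functools import lru_cache

def count_bruteforce(n):
    """Compositions of n with first part 1 and each part at most twice the previous."""
    @lru_cache(maxsize=None)
    def ways(remaining, last):
        if remaining == 0:
            return 1
        return sum(ways(remaining - k, k) for k in range(1, min(2 * last, remaining) + 1))
    return ways(n - 1, 1) if n >= 1 else 0

def series_from_formula(N):
    """Coefficients [z^0 .. z^N] of the quotient of alternating sums."""
    def inv_Q(j):
        # Power series of 1/Q_j(z), with Q_j = prod_{i=1..j} (1 - z^(2^i - 1)).
        s = [1] + [0] * N
        for i in range(1, j + 1):
            step = 2**i - 1
            for n in range(step, N + 1):
                s[n] += s[n - step]
        return s

    num, den = [0] * (N + 1), [0] * (N + 1)
    j = 0
    while 2 ** (j + 1) - j - 2 <= N:
        e = 2 ** (j + 1) - j - 2
        q = inv_Q(j)
        for n in range(e, N + 1):
            den[n] += (-1) ** j * q[n - e]
        if j >= 1:
            q_prev = inv_Q(j - 1)
            for n in range(e, N + 1):
                num[n] += (-1) ** (j - 1) * q_prev[n - e]
        j += 1
    # Power-series division num / den (den[0] == 1).
    out = []
    for n in range(N + 1):
        out.append(num[n] - sum(den[k] * out[n - k] for k in range(1, n + 1)))
    return out
```

Both routes give 1, 1, 2, 3, 5, 9, 16, 28, 50 for sizes 1 to 9, the values quoted in the text.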


The reason for presenting the slicing method in some detail is that it is very gen- 
eral. It has been in particular employed to derive a number of original enumerations of 
polyominoes by area, a topic of interest in some branches of statistical mechanics: for 
instance, the book by Janse van Rensburg [482] discusses many applications of such 
lattice models to polymers and vesicles. See Bousquet-Mélou’s review paper [66] for
a methodological perspective. Some of the origins of the method point to Polya in the
1930’s, see [396], and independently to Temperley [466, pp. 65-67].


▷ III.30. Pointing-erasing and the combinatorics of Taylor’s formula. The derivative operator
$\partial_x$ corresponds combinatorially to a “pointing-erasing” operation: select in all possible ways
an atom marked by $x$ and make it transparent to $x$-marking (e.g., by replacing it by a neutral
object). The operator $\frac{1}{k!}\partial_x^k$ then corresponds to picking up in all possible ways a subset
(order does not count) of $k$ configurations marked by $x$. The identity (Taylor’s formula)

\[ f(x+y) = \sum_{k \ge 0} \left(\frac{1}{k!}\,\partial_x^k f(x)\right) y^k \]

can then receive a simple combinatorial interpretation: Given a population of individuals ($\mathcal{F}$
enumerated by $f$), form the bicoloured population of individuals enumerated by $f(x+y)$,
where each atom of each object can be repainted either in $x$-colour or $y$-colour; the process is
equivalent to deciding a priori for each individual to repaint $k$ of its atoms from $x$ to $y$, this
for all possible values of $k \ge 0$. Seen from combinatorics, Taylor’s formula thus expresses the
equivalence between two ways of counting. ◁


▷ III.31. Carlitz compositions I. Let $\mathcal{K}$ be the class of compositions such that all pairs of
adjacent summands are formed of distinct values. These can be generated by the operator
$\mathcal{L}[u^j] = \frac{uz}{1-uz} - u^j z^j$, so that $\mathcal{L}[f(u)] = \frac{uz}{1-uz}\,f(1) - f(uz)$. The BGF $K(z,u)$, with $u$
marking the value of the last summand, then satisfies a functional equation,

\[ K(z,u) = \frac{uz}{1-uz} + \frac{uz}{1-uz}\,K(z,1) - K(z,zu), \]

giving eventually the OGF $K(z) = 1 + K(z,1)$ of all Carlitz compositions (the empty one included) under the form

(63)    \[ K(z) = \left(1 + \sum_{j \ge 1} \frac{(-z)^j}{1-z^j}\right)^{-1} = 1 + z + z^2 + 3z^3 + 4z^4 + 7z^5 + 14z^6 + 23z^7 + 39z^8 + \cdots. \]

The sequence of coefficients constitutes EIS A003242. Such compositions have been introduced
by Carlitz in 1976; the derivation above is from a paper by Knopfmacher and Prodinger [296]
who provide early references and asymptotic properties. (We resume this thread in Note 34
below and in Chapter IV, p. 249.) ◁


III.7.2. Order constraints. We refer in this subsection to the discussion of order
constraints in labelled products that has been given in Chapter II. We recall that
the modified labelled product

\[ \mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}) \]

only includes the elements of $(\mathcal{B} \star \mathcal{C})$ such that the minimal label lies in the $\mathcal{B}$ component.
Once more the univariate rules generalize verbatim for parameters that are
inherited and the corresponding exponential MGFs are related by

\[ A(z,\mathbf{u}) = \int_0^z \bigl(\partial_t B(t,\mathbf{u})\bigr) \cdot C(t,\mathbf{u})\,dt. \]


To illustrate this multivariate extension, we shall consider a quadrivariate statistic on
permutations.


peak:          $\sigma_{i-1} < \sigma_i > \sigma_{i+1}$      leaf node ($u_0$)
double rise:   $\sigma_{i-1} < \sigma_i < \sigma_{i+1}$      unary right-branching ($u_1$)
double fall:   $\sigma_{i-1} > \sigma_i > \sigma_{i+1}$      unary left-branching ($u_1'$)
valley:        $\sigma_{i-1} > \sigma_i < \sigma_{i+1}$      binary node ($u_2$)

FIGURE III.15. Local order patterns in a permutation and the four types of nodes in the
corresponding increasing binary tree.


EXAMPLE III.22. Local order patterns in permutations. An element $\sigma_i$ of a permutation
written $\sigma = \sigma_1, \ldots, \sigma_n$ when compared to its immediate neighbours can be categorized into
one of four types⁴ summarized in the first two columns of Figure 15. The correspondence with
binary increasing trees described in Example 17 of Chapter II then shows the following: peaks
and valleys correspond to leaves and binary nodes, respectively (a node has a left child exactly
when $\sigma_{i-1} > \sigma_i$ and a right child exactly when $\sigma_{i+1} > \sigma_i$), while double rises and double
falls are associated with right-branching and left-branching unary nodes. Let $u_0, u_1, u_1', u_2$ be
markers for the number of nodes of each type, as summarized in Figure 15. Then the exponential
MGF of increasing trees under this statistic satisfies

\[ \frac{\partial}{\partial z} I(z,\mathbf{u}) = u_0 + (u_1 + u_1')\,I(z,\mathbf{u}) + u_2\,I(z,\mathbf{u})^2. \]

This is solved by separation of variables as

(64)    \[ I(z,\mathbf{u}) = \frac{\delta}{u_2}\,\frac{v_1 + \delta\tan(z\delta)}{\delta - v_1\tan(z\delta)} - \frac{v_1}{u_2}, \]

where the following abbreviations are used:

\[ v_1 = \tfrac12\,(u_1 + u_1'), \qquad \delta = \sqrt{u_0u_2 - v_1^2}. \]

One has

\[ I = u_0 z + u_0(u_1 + u_1')\,\frac{z^2}{2!} + u_0\bigl((u_1 + u_1')^2 + 2u_0u_2\bigr)\frac{z^3}{3!} + \cdots, \]

which agrees with the small cases. This calculation is consistent with what has been found in
Chapter II regarding the EGF of all nonempty permutations and of alternating permutations,

\[ \frac{z}{1-z}, \qquad \tan(z), \]

that derive from the substitutions $\{u_0 = u_1 = u_1' = u_2 = 1\}$ and $\{u_0 = u_2 = 1,\ u_1 = u_1' = 0\}$,
respectively. The substitution $\{u_0 = u_1 = u,\ u_1' = u_2 = 1\}$ gives a simple variant
(without the empty permutation) of the BGF of Eulerian numbers (73) derived below by other
means (p. 197).

By specialization of the quadrivariate GF, it follows that, in a tree of size $n$, the mean
number of nodes of nullary, unary, or binary type is asymptotic to $n/3$, with a variance that is
$O(n)$, thereby ensuring concentration of distribution.
END OF EXAMPLE III.22.
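
The small cases of the quadrivariate statistic can be recomputed by exhaustive enumeration. The sketch below is illustrative only. It borders each permutation by $-\infty$ on both sides and records the exponents of $(u_0, u_1, u_1', u_2)$ as (peaks, double rises, double falls, valleys); this assignment is the one forced by the bordering, since a permutation of size 1 is then a peak and its tree is a single leaf, in agreement with the leading term $u_0 z$.

```python
from collections import Counter
from itertools import permutations

NEG_INF = float("-inf")

def node_profile(perm):
    """Return (peaks, double rises, double falls, valleys) of a permutation
    bordered by -infinity on both sides."""
    s = (NEG_INF,) + tuple(perm) + (NEG_INF,)
    peaks = rises = falls = valleys = 0
    for i in range(1, len(s) - 1):
        up_in, up_out = s[i - 1] < s[i], s[i] < s[i + 1]
        if up_in and not up_out:
            peaks += 1
        elif up_in and up_out:
            rises += 1
        elif not up_in and not up_out:
            falls += 1
        else:
            valleys += 1
    return peaks, rises, falls, valleys

def profile_counts(n):
    """Multiset of profiles over all permutations of size n."""
    return Counter(node_profile(p) for p in permutations(range(1, n + 1)))
```

For $n = 3$ the six permutations give the monomials $u_0u_1^2 + 2u_0u_1u_1' + u_0u_1'^2 + 2u_0^2u_2$, which is the coefficient of $z^3/3!$ in the expansion of $I$. Keeping only the profiles with no double rise and no double fall recovers the tangent numbers 1, 2, 16 for $n = 1, 3, 5$.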


A similar analysis yields path length. It is found that a random increasing binary
tree of size $n$ has mean path length

\[ 2n\log n + O(n). \]
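
The estimate can be compared with exact values. The sketch below rests on an assumption not spelled out in the text: since the root of an increasing binary tree holds the minimum, and the minimum is equally likely to sit at any of the $n$ positions, the mean path length $P_n$ satisfies $P_n = (n-1) + \frac{2}{n}\sum_{k<n} P_k$ with $P_0 = 0$. The function name is illustrative.

```python
from fractions import Fraction

def mean_path_length(n_max):
    """Exact mean path lengths P_0 .. P_n_max from the assumed recurrence
    P_n = (n - 1) + (2 / n) * sum_{k < n} P_k."""
    P, running = [Fraction(0)], Fraction(0)   # running = P_0 + ... + P_{n-1}
    for n in range(1, n_max + 1):
        P.append(n - 1 + 2 * running / n)
        running += P[-1]
    return P
```

Under this recurrence one finds $P_n = 2(n+1)H_n - 4n$ with $H_n$ the harmonic number, which is indeed of the form $2n\log n + O(n)$.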


⁴Here, for $|\sigma| = n$, we regard $\sigma$ as bordered by $(-\infty, -\infty)$, i.e., we set $\sigma_0 = \sigma_{n+1} = -\infty$ and let
the index $i$ in Figure 15 vary in $[1\,..\,n]$. Alternative bordering conventions prove occasionally useful.

FIGURE III.16. The level profile of a random increasing binary tree of size 256. (Compare
with Figure 13 for binary trees under the uniform Catalan statistic.)


Contrary to what the uniform combinatorial model gives, such trees tend to be rather
well balanced, and a typical branch is only about 38.6% longer than in a perfect binary
tree (a branch has mean length about $2\log n$, against $\log_2 n = \log n/\log 2$ in a perfect
binary tree, and $2\log 2 \doteq 1.386$). This fact applies to binary search trees (Note 32) and
it justifies that the performance of such trees is quite good when they are applied to
random data [307, 351, 434] or subjected to randomization [416, 370].

▷ III.32. Binary search trees (BSTs). Given a permutation $\tau$, one defines inductively a tree
$\mathrm{BST}(\tau)$ by

\[ \mathrm{BST}(\epsilon) = \emptyset; \qquad \mathrm{BST}(\tau) = \bigl\langle \tau_1,\ \mathrm{BST}(\tau|_{<\tau_1}),\ \mathrm{BST}(\tau|_{>\tau_1}) \bigr\rangle. \]

(There, $\tau|_P$ represents the subword of $\tau$ consisting of those elements that satisfy predicate $P$.)
Let $\mathrm{IBT}(\sigma)$ be the increasing binary tree canonically associated to $\sigma$. Then one has the
fundamental Equivalence Principle,

\[ \mathrm{IBT}(\sigma) \overset{\text{shape}}{\equiv} \mathrm{BST}(\sigma^{-1}), \]

where $A \overset{\text{shape}}{\equiv} B$ means that $A$ and $B$ have identical tree shapes. ◁
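
The Equivalence Principle lends itself to an exhaustive check on small sizes. The sketch below is illustrative: tree shapes are encoded as nested pairs `(left, right)` with `None` for the empty tree, and the function names are not from the text.

```python
from itertools import permutations

def ibt_shape(seq):
    """Shape of the increasing binary tree: the minimum is the root,
    the elements to its left and right form the two subtrees."""
    if not seq:
        return None
    i = seq.index(min(seq))
    return (ibt_shape(seq[:i]), ibt_shape(seq[i + 1:]))

def bst_shape(seq):
    """Shape of the binary search tree obtained by inserting seq in order."""
    if not seq:
        return None
    root = seq[0]
    smaller = tuple(x for x in seq[1:] if x < root)
    larger = tuple(x for x in seq[1:] if x > root)
    return (bst_shape(smaller), bst_shape(larger))

def inverse(perm):
    """Inverse of a permutation of 1..n given as a tuple of values."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inv[value - 1] = position
    return tuple(inv)

def equivalence_holds(n):
    """Check IBT(sigma) and BST(sigma^{-1}) have the same shape for all sigma of size n."""
    return all(ibt_shape(p) == bst_shape(inverse(p)) for p in permutations(range(1, n + 1)))
```

The reason behind the check is visible in the code: in the increasing binary tree, positions play the role of keys and values the role of insertion times, and passing to the inverse permutation exchanges the two.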


III.7.3. Implicit structures. Here again, we note that equations involving sums
and products, either labelled or not, are easily solved just like in the univariate case.
The same applies for the sequence construction and for the set construction, especially
in the labelled case; refer to the corresponding sections of Chapters I and II. Again,
the process is best understood by examples.

Suppose for instance one wants to enumerate connected labelled graphs by the
number of nodes (marked by $z$) and the number of edges (marked by $u$). The class $\mathcal{K}$
of connected graphs and the class $\mathcal{G}$ of all graphs are related by the set construction,

\[ \mathcal{G} = \mathrm{SET}\{\mathcal{K}\}, \]

meaning that every graph decomposes uniquely into connected components. The corresponding
exponential BGFs then satisfy

\[ G(z,u) = e^{K(z,u)}, \qquad \text{implying} \qquad K(z,u) = \log G(z,u), \]

since the number of edges in a graph is inherited (additively) from the corresponding
numbers in connected components. Now, the number of graphs of size $n$ having $k$
edges is $\binom{n(n-1)/2}{k}$, so that

(65)    \[ K(z,u) = \log\left(1 + \sum_{n \ge 1} (1+u)^{n(n-1)/2}\,\frac{z^n}{n!}\right). \]

This formula, which appears as a refinement of the univariate formula of Chapter II,
then simply reads: connected graphs are obtained as components (the log operator) of
general graphs, where a general graph is determined by the presence or absence of an
edge (corresponding to $(1+u)$) between any pair of nodes (the exponent $n(n-1)/2$).
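
Coefficients of the logarithm can be extracted without expanding $\log$ explicitly. The sketch below is one possible route, not the text's: differentiating $G = e^K$ with respect to $z$ gives $\partial_z G = (\partial_z K)\,G$, that is, $g_n(u) = \sum_{m=1}^{n} \binom{n-1}{m-1} c_m(u)\, g_{n-m}(u)$ for the coefficient polynomials $g_n = n![z^n]G$ and $c_n = n![z^n]K$. Function names are illustrative.

```python
from math import comb

def poly_mul(a, b):
    """Product of two polynomials in u given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def connected_by_edges(n_max):
    """c[n][k] = number of connected labelled graphs with n nodes and k edges."""
    # g[n] holds the coefficients in u of (1 + u)^(n(n-1)/2): all graphs on n nodes.
    g = [[comb(comb(n, 2), k) for k in range(comb(n, 2) + 1)] for n in range(n_max + 1)]
    c = [[0]]
    for n in range(1, n_max + 1):
        cn = list(g[n])
        # Subtract graphs whose component containing node 1 has size m < n.
        for m in range(1, n):
            term = poly_mul(c[m], g[n - m])
            for k, x in enumerate(term):
                cn[k] -= comb(n - 1, m - 1) * x
        c.append(cn)
    return c
```

For four nodes this yields 16 trees, 15 connected graphs with four edges, 6 with five edges and the complete graph, for a total of 38.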

Pulling information out of the formula (65) is however not obvious due to the
alternation of signs in the expansion of $\log(1+w)$ and due to the strongly divergent
character of the involved series. As an aside, we note here that the quantity

\[ \hat K(z,u) = K\!\left(\frac{z}{u},\,u\right) \]

enumerates connected graphs according to size (marked by $z$) and excess (marked
by $u$) of the number of edges over the number of nodes. This means that the results of
Section 5.3 of Chapter II obtained by Wright’s decomposition can be rephrased as the
expansion (within $\mathbb{C}(u)[[z]]$):

\[ \log\left(1 + \sum_{n \ge 1} (1+u)^{n(n-1)/2}\,\frac{z^n u^{-n}}{n!}\right) = \frac{1}{u}\,W_{-1}(z) + W_0(z) + \cdots \]
\[ \qquad = \frac{1}{u}\left(T - \frac12 T^2\right) + \left(\frac12 \log\frac{1}{1-T} - \frac12 T - \frac14 T^2\right) + \cdots, \]

with $T \equiv T(z)$. See Temperley’s early works [465, 466] as well as the “giant paper on
the giant component” [282] and the paper [205] for direct derivations that eventually
constitute analytic alternatives to Wright’s combinatorial approach.


EXAMPLE III.23. Smirnov words. Following the treatment of Goulden and Jackson [244],
we define a Smirnov word to be any word that has no consecutive equal letters. Let $\mathcal{W} =
\mathrm{SEQ}\{\mathcal{A}\}$ be the set of words over the alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$ of cardinality $r$, and $\mathcal{S}$ be the
set of Smirnov words. Let also $v_j$ mark the number of occurrences of the $j$th letter in a word.
One has⁵

\[ W(v_1, \ldots, v_r) = \frac{1}{1 - (v_1 + \cdots + v_r)}. \]

Start from a Smirnov word and substitute to any letter $a_j$ that appears in it an arbitrary nonempty
sequence of letters $a_j$. When this operation is done at all places of a Smirnov word, it gives
rise to an unconstrained word. Conversely, any word is associated to a unique Smirnov word
by collapsing into single letters maximal groups of contiguous equal letters. In other terms,
arbitrary words derive from Smirnov words by a simultaneous substitution:

\[ \mathcal{W} = \mathcal{S}\bigl[a_1 \mapsto \mathrm{SEQ}_{\ge 1}\{a_1\},\ \ldots,\ a_r \mapsto \mathrm{SEQ}_{\ge 1}\{a_r\}\bigr]. \]

There results the relation

(67)    \[ W(v_1, \ldots, v_r) = S\left(\frac{v_1}{1-v_1}, \ldots, \frac{v_r}{1-v_r}\right). \]

This relation determines the MGF $S(v_1, \ldots, v_r)$ implicitly. Now, since the inverse function of
$v/(1-v)$ is $v/(1+v)$, one finds the solution:

(68)    \[ S(v_1, \ldots, v_r) = W\left(\frac{v_1}{1+v_1}, \ldots, \frac{v_r}{1+v_r}\right) = \left(1 - \sum_{j=1}^{r} \frac{v_j}{1+v_j}\right)^{-1}. \]

⁵The variable $z$ marking length being here unnecessary, it is omitted; it would otherwise somewhat
obscure the simplicity of the calculations.

For instance, if we set $v_j = z$, that is, we “forget” the composition of the words into letters,
we obtain the OGF of Smirnov words counted according to length as

\[ \frac{1}{1 - r\,\frac{z}{1+z}} = \frac{1+z}{1-(r-1)z} = 1 + \sum_{n \ge 1} r(r-1)^{n-1} z^n. \]

This is consistent with elementary combinatorics since a Smirnov word of length $n$ is determined
by the choice of its first letter ($r$ possibilities) followed by a sequence of $n-1$ choices
constrained to avoid one letter amongst $r$ (and corresponding to $r-1$ possibilities for each
position). The interest of (68) is to apply equally well to the Bernoulli model where letters may
receive unequal probabilities and where a direct combinatorial argument does not appear to be
easy: it suffices to perform the substitution $v_j \mapsto p_j z$ in this case: see Example IV.9, p. 249
and Note V.7, p. 289.
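
The Bernoulli-model remark can be tested directly. The sketch below assumes formula (68) in the form $S = \bigl(1 - \sum_j v_j/(1+v_j)\bigr)^{-1}$ with $v_j \mapsto w_j z$, where the $w_j$ are letter weights (probabilities, or all equal to 1 for plain counting); the function names and the sample weights are illustrative.

```python
from fractions import Fraction
from itertools import product

def smirnov_series(weights, n_max):
    """[z^n] of (1 - sum_j w_j z / (1 + w_j z))^(-1) for n = 0 .. n_max."""
    # t[m] = [z^m] sum_j w_j z / (1 + w_j z) = (-1)^(m-1) * sum_j w_j^m
    t = [0] + [(-1) ** (m - 1) * sum(w**m for w in weights) for m in range(1, n_max + 1)]
    s = [Fraction(1)]
    for n in range(1, n_max + 1):
        s.append(sum(t[m] * s[n - m] for m in range(1, n + 1)))
    return s

def smirnov_bruteforce(weights, n):
    """Total weight of the words of length n with no two equal adjacent letters."""
    total = Fraction(0)
    for word in product(range(len(weights)), repeat=n):
        if all(a != b for a, b in zip(word, word[1:])):
            weight = Fraction(1)
            for letter in word:
                weight *= weights[letter]
            total += weight
    return total
```

With unit weights on four letters the series gives 1, 4, 12, 36, ..., that is $r(r-1)^{n-1}$. With unequal probabilities such as $(1/2, 1/3, 1/6)$ the coefficient of $z^n$ is the probability that a random word of length $n$ is a Smirnov word.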

From these developments, one can next build the GF of words that never contain more
than $m$ consecutive equal letters. It suffices to effect in (68) the substitution $v_j \mapsto v_j + \cdots + v_j^m$.
In particular for the univariate problem (or, equivalently, the case where letters are equiprobable),
one finds the OGF

\[ \frac{1}{1 - r\,\dfrac{z\frac{1-z^m}{1-z}}{1 + z\frac{1-z^m}{1-z}}} = \frac{1 - z^{m+1}}{1 - rz + (r-1)z^{m+1}}. \]

This extends to an arbitrary alphabet the analysis of single runs and double runs in binary words
that was performed in Section 4 of Chapter I. Naturally, this approach applies equally well to
nonuniform letter probabilities and to a collection of run-length upper bounds and lower bounds
dependent on each particular letter. This topic is in particular pursued by different methods in
several works of Karlin and coauthors (see, e.g., [365]), themselves motivated by applications
to life sciences.
END OF EXAMPLE III.23.


▷ III.33. Enumeration in free groups. Consider the composite alphabet $\mathcal{B} = \mathcal{A} \cup \bar{\mathcal{A}}$, where
$\mathcal{A} = \{a_1, \ldots, a_r\}$ and $\bar{\mathcal{A}} = \{\bar a_1, \ldots, \bar a_r\}$. A word over alphabet $\mathcal{B}$ is said to be reduced if it
arises from a word over $\mathcal{B}$ by a maximal application of the reductions $a_j \bar a_j \mapsto \epsilon$ and $\bar a_j a_j \mapsto \epsilon$
(with $\epsilon$ the empty word). A reduced word thus has no factor of the form $a_j \bar a_j$ or $\bar a_j a_j$. Such a
reduced word serves as a canonical representation of an element in the free group $F_r$ generated
by $\mathcal{A}$, upon identifying $\bar a_j = a_j^{-1}$. The GF of reduced words with $u_j$ and $\bar u_j$ marking the
number of occurrences of letter $a_j$ and $\bar a_j$, respectively, is

\[ R(u_1, \ldots, u_r, \bar u_1, \ldots, \bar u_r) = S\left(\frac{u_1}{1-u_1} + \frac{\bar u_1}{1-\bar u_1},\ \ldots,\ \frac{u_r}{1-u_r} + \frac{\bar u_r}{1-\bar u_r}\right), \]

with $S$ the GF of Smirnov words, as in (68). In particular this specializes to give the OGF of
reduced words with $z$ marking length, $R(z) = (1+z)/(1-(2r-1)z)$, implying $R_n =
2r(2r-1)^{n-1}$, which checks with what elementary combinatorics gives.

The Abelian image $\lambda(w)$ of an element $w$ of the free group $F_r$ is obtained by letting
all letters commute and applying the reductions $a_j \cdot a_j^{-1} = 1$. It can then be put under the
form $a_1^{m_1} \cdots a_r^{m_r}$, with each $m_j$ in $\mathbb{Z}$, so that it can be identified with an element of $\mathbb{Z}^r$.
Let $\mathbf{x} = (x_1, \ldots, x_r)$ be a vector of indeterminates and define $\mathbf{x}^{\lambda(w)}$ to be the monomial
$x_1^{m_1} \cdots x_r^{m_r}$. Of interest in certain group-theoretic investigations is the MGF

\[ Q(z;\mathbf{x}) := \sum_{w \in \mathcal{R}} z^{|w|}\,\mathbf{x}^{\lambda(w)} = S\left(\frac{zx_1}{1-zx_1} + \frac{zx_1^{-1}}{1-zx_1^{-1}},\ \ldots,\ \frac{zx_r}{1-zx_r} + \frac{zx_r^{-1}}{1-zx_r^{-1}}\right), \]

which is found to simplify to

\[ Q(z;\mathbf{x}) = \frac{1 - z^2}{1 - z\sum_{j=1}^{r}\bigl(x_j + x_j^{-1}\bigr) + (2r-1)z^2}. \]

This last form appears in a paper of Rivin [412], where it is obtained by matrix techniques.
Methods developed in Chapter IX can then be used to establish central and local limit laws for
the asymptotic distribution of $\lambda(w)$ over $\mathcal{R}_n$, providing an alternative to the methods of [412,
435]. (This note is based on an unpublished memo of Flajolet, Noy, and Ventura, 2006.) ◁


▷ III.34. Carlitz compositions II. Here is an alternative derivation of the OGF of Carlitz
compositions (Note 31, p. 190). Carlitz compositions with largest summand $\le r$ are obtained
from the OGF of Smirnov words by the substitution $v_j \mapsto z^j$:

(69)    \[ K^{[r]}(z) = \left(1 - \sum_{j=1}^{r} \frac{z^j}{1+z^j}\right)^{-1}. \]

The OGF of all Carlitz compositions then results from letting $r \to \infty$:

(70)    \[ K(z) = \left(1 - \sum_{j=1}^{\infty} \frac{z^j}{1+z^j}\right)^{-1}. \]

The asymptotic form of the coefficients is derived in Chapter IV, p. 249. ◁
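
The OGF of Carlitz compositions can be expanded and compared with a direct count. The sketch below assumes (70) in the form $K(z) = \bigl(1 - \sum_{j \ge 1} z^j/(1+z^j)\bigr)^{-1}$; the function names are illustrative.

```python
def carlitz_series(n_max):
    """[z^n] of (1 - sum_{j>=1} z^j / (1 + z^j))^(-1) for n = 0 .. n_max."""
    t = [0] * (n_max + 1)
    for j in range(1, n_max + 1):
        for m in range(1, n_max // j + 1):
            t[j * m] += (-1) ** (m - 1)   # z^j/(1+z^j) = sum_m (-1)^(m-1) z^(jm)
    k = [1]
    for n in range(1, n_max + 1):
        k.append(sum(t[i] * k[n - i] for i in range(1, n + 1)))
    return k

def carlitz_bruteforce(n, last=0):
    """Number of compositions of n whose adjacent summands are distinct
    (last is the previously chosen summand, 0 if none)."""
    if n == 0:
        return 1
    return sum(carlitz_bruteforce(n - part, part) for part in range(1, n + 1) if part != last)
```

Both computations give 1, 1, 1, 3, 4, 7, 14, 23, 39 for sizes 0 to 8, the start of EIS A003242.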


III.7.4. Inclusion-Exclusion. Inclusion-exclusion is a familiar type of reasoning
rooted in elementary mathematics. Its principle, in order to count exactly, consists
in grossly overcounting, then performing a simple correction of the overcounting, then
correcting the correction, and so on. Characteristically, enumerative results provided
by inclusion-exclusion involve an alternating sum. We revisit this process here in the
perspective of multivariate generating functions, where it essentially reduces to a combined
use of substitution and implicit definitions. Our approach follows Goulden and
Jackson’s encyclopedic treatise [244].

Let $\mathcal{E}$ be a set endowed with a real or complex valued measure $|\cdot|$ in such a way
that, for $A, B \subset \mathcal{E}$, there holds

\[ |A \cup B| = |A| + |B| \qquad \text{whenever} \qquad A \cap B = \emptyset. \]

Thus, $|\cdot|$ is an additive measure, typically taken as set cardinality (i.e., $|e| = 1$ for
$e \in \mathcal{E}$) or a discrete probability measure on $\mathcal{E}$ (i.e., $|e| = p_e$ for $e \in \mathcal{E}$). The general
formula

\[ |A \cup B| = |A| + |B| - |AB| \qquad \text{where} \qquad AB := A \cap B, \]

follows immediately from basic set-theoretic principles:

\[ \sum_{c \in A \cup B} |c| = \sum_{a \in A} |a| + \sum_{b \in B} |b| - \sum_{i \in A \cap B} |i|. \]

What is called the inclusion-exclusion principle or sieve formula is the following multivariate
generalization, for an arbitrary family $A_1, \ldots, A_r \subset \mathcal{E}$:

(71)    \[ |A_1 \cup \cdots \cup A_r| \equiv \bigl|\mathcal{E} \setminus (\bar A_1 \bar A_2 \cdots \bar A_r)\bigr| \qquad \text{where } \bar A := \mathcal{E} \setminus A \]
        \[ \qquad = \sum_{1 \le i \le r} |A_i| - \sum_{1 \le i_1 < i_2 \le r} |A_{i_1} A_{i_2}| + \cdots + (-1)^{r-1} |A_1 A_2 \cdots A_r|. \]

(The easy proof by induction results from elementary properties of the boolean algebra
formed by the subsets of $\mathcal{E}$; see, e.g., [98, Ch. IV].) An alternative formulation results
from setting $\bar B_j = A_j$, $B_j = \bar A_j$:

(72)    \[ |B_1 B_2 \cdots B_r| = |\mathcal{E}| - \sum_{1 \le i \le r} |\bar B_i| + \sum_{1 \le i_1 < i_2 \le r} |\bar B_{i_1} \bar B_{i_2}| - \cdots + (-1)^{r} |\bar B_1 \bar B_2 \cdots \bar B_r|. \]

In terms of measure, this equality quantifies the set of objects satisfying exactly a
collection of simultaneous conditions (all the $B_j$) in terms of those that violate at
least some of the conditions (the $\bar B_j$).


Derangements. Here is a textbook example of an inclusion-exclusion argument,
namely, the enumeration of derangements. Recall that a derangement is a permutation
$\sigma$ such that $\sigma_i \ne i$, for all $i$. Fix $\mathcal{E}$ as the set of all permutations of $[1, n]$, take
the measure $|\cdot|$ to be set cardinality, and let $B_i$ be the subset of permutations in $\mathcal{E}$
associated to the property $\sigma_i \ne i$. (There are consequently $r = n$ conditions.) Thus,
$B_i$ means having no fixed point at $i$, while $\bar B_i$ means having a fixed point at the
distinguished value $i$. Then, the left hand side of (72) is the number of permutations that
are derangements, that is, $D_n$. As regards the right hand side, the $k$th sum comprises
itself $\binom{n}{k}$ terms counting possibilities attached to the choices of indices $i_1 < \cdots < i_k$;
each such choice is associated to a factor $\bar B_{i_1} \cdots \bar B_{i_k}$ that describes all permutations
with fixed points at the distinguished points $i_1, \ldots, i_k$ (i.e., $\sigma(i_1) = i_1, \ldots, \sigma(i_k) = i_k$).
Clearly, $|\bar B_{i_1} \cdots \bar B_{i_k}| = (n-k)!$. Therefore one has

\[ D_n = n! - \binom{n}{1}(n-1)! + \binom{n}{2}(n-2)! - \cdots + (-1)^n \binom{n}{n}\,0!, \]

which rewrites into the more familiar form

\[ \frac{D_n}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!}. \]

This gives an elementary derivation of the derangement numbers already encountered
in Chapter II and obtained there by means of the labelled set and cycle constructions.
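
The sieve computation is short enough to be mirrored in code. The following is a minimal sketch (function names are illustrative) that evaluates the alternating sum $\sum_k (-1)^k \binom{n}{k}(n-k)!$ and compares it with an exhaustive count.

```python
from itertools import permutations
from math import comb, factorial

def derangements_sieve(n):
    """Alternating sum over the number k of distinguished fixed points."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))

def derangements_bruteforce(n):
    """Count permutations of {0, ..., n-1} without fixed points."""
    return sum(1 for p in permutations(range(n)) if all(p[i] != i for i in range(n)))
```

The first values are 1, 0, 1, 2, 9, 44, 265 for $n = 0, \ldots, 6$.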


The derivation above is perfectly fine but carrying it out on complex examples 
may represent somewhat of a challenge. In contrast, as we now explain, there exists 
a parallel approach based on multivariate generating functions, which is technically 
easy to deal with and has great versatility. 

Let us now reexamine derangements in a generating function perspective. Consider
the set $\mathcal{P}$ of all permutations and build a superset $\mathcal{Q}$ as follows. The set $\mathcal{Q}$
is comprised of permutations in which an arbitrary number of fixed points (some,
maybe none, not necessarily all) have been distinguished. (This corresponds to arbitrary
products of the $\bar B_j$ in the argument above.) For instance $\mathcal{Q}$ contains elements
like

1, 3, 2,    1̲, 3, 2,    1, 2, 3,    1̲, 2, 3,    1, 2̲, 3,    1, 2, 3̲,

where distinguished fixed points are underlined. Clearly, if one removes the distinguished
elements of a $\gamma \in \mathcal{Q}$, what is left constitutes an arbitrary permutation of the
remaining elements. One has

\[ \mathcal{Q} = \mathcal{U} \star \mathcal{P}, \]

where $\mathcal{U}$ denotes the class of urns that are sets of atoms. In particular, the EGF of $\mathcal{Q}$ is
$Q(z) = e^z/(1-z)$. What we’ve just done is to enumerate the quantities that appear
in (72), but with the signs “wrong”, i.e., all pluses.

Introduce now the variable $v$ to mark the distinguished fixed points in objects
of $\mathcal{Q}$. The exponential BGF is then by general principles of this chapter:

\[ Q(z,v) = e^{vz}\,\frac{1}{1-z}. \]

Let now $P(z,u)$ be the BGF of permutations where $u$ marks the number of fixed
points. (Let us ignore momentarily the fact that $P(z,u)$ is otherwise known.) Permutations
with some fixed points distinguished are generated by the substitution $u \mapsto
1 + v$ inside $P(z,u)$. In other words one has the fundamental inclusion-exclusion
relation

\[ Q(z,v) = P(z, 1+v). \]

This is then easily solved as

\[ P(z,u) = Q(z, u-1), \]

so that knowledge of (the easy) $Q$ gives (the harder) $P$. For the case at hand, this
yields

\[ P(z,u) = \frac{e^{(u-1)z}}{1-z}, \qquad P(z,0) = D(z) = \frac{e^{-z}}{1-z}, \]

and, in particular, the EGF of derangements has been retrieved. Note that the sought
$P(z,0)$ comes out as $Q(z,-1)$, so that signs corresponding to the sieve formula (72)
have now been put “right”, i.e., alternating.

The process employed for derangements is clearly very general. It is a generating
function analogue of the inclusion-exclusion principle: counting objects that satisfy a
number of simultaneous constraints is reduced to counting objects that violate some of
the constraints at distinguished “places”; the latter is usually a simpler problem. The
generating function analogue of inclusion-exclusion is then simply the substitution
$v \mapsto u - 1$, if a bivariate GF is sought, or $v \mapsto -1$ in the univariate case.
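
The substitution $v \mapsto u - 1$ can be followed on coefficients. The sketch below assumes the form $P(z,u) = e^{(u-1)z}/(1-z)$, so that $[z^n]P = \sum_{m=0}^{n} (u-1)^m/m!$, and compares $n!\,[z^n u^k]P$ with a direct tally of permutations by number of fixed points. Function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def fixed_point_counts_from_bgf(n):
    """n! [z^n u^k] exp((u - 1) z) / (1 - z) for k = 0 .. n."""
    counts = []
    for k in range(n + 1):
        # [u^k] (u - 1)^m = (-1)^(m - k) * binomial(m, k)
        coeff = sum(Fraction((-1) ** (m - k) * comb(m, k), factorial(m)) for m in range(k, n + 1))
        counts.append(int(factorial(n) * coeff))
    return counts

def fixed_point_counts_bruteforce(n):
    """Number of permutations of size n with exactly k fixed points, k = 0 .. n."""
    counts = [0] * (n + 1)
    for p in permutations(range(n)):
        counts[sum(p[i] == i for i in range(n))] += 1
    return counts
```

For $n = 4$ both give 9, 8, 6, 0, 1: nine derangements, eight permutations with exactly one fixed point, and so on.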


Rises in permutations and patterns in words. The book by Goulden and Jack- 
son [244, pp. 45-48] describes a useful formalization of the inclusion process operat- 
ing on MGFs. Conceptually, it combines substitution and implicit definitions. Once 
again, the modus operandi is best grasped through examples, two of which are detailed 
below. 


EXAMPLE III.24. Rises and ascending runs in permutations. A rise (also called an ascent) 
in a permutation 0 = 01 ---o» is a pair of consecutive elements o;0;+1 satisfying 0; < oi41 
(with 1 < i < n). The problem is to determine the number A,,,, of permutations of size having 
exactly k rises, together with the BGF A(z, u). By symmetry, we are also enumerating descents 
(defined by o; > 0441) as well as ascending runs that are each terminated by a descent. 
Guided by the inclusion-exclusion principle, we tackle the easier problem of enumerating 
permutations with distinguished rises, of which the set is denoted by B. For instance, 6 contains 
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elements like 


261) 3,74,78,79,11 |15 12/5,710| 13714, 


where those rises that are distinguished are represented by arrows. (Note that some rises may 
not be distinguished.) Maximal sequences of adjacent distinguished rises (boxed in the repre- 
sentation) will be called clusters. Then, B can be specified by the sequence construction applied 
to atoms (Z) and clusters (C) as 


B= SEQ(Z +0), where C=(Z /72Z)4+(Z2/2Z 7 Z) +--+ = SETs2(Z). 


since a cluster is an ordered sequence, or equivalently a set, furthermore having at least two 
elements. This gives the EGF of B as 


1 1 


HOST Gg @Sl=o) 2-8 


which happens to coincide with the EGF of surjections. 
For inclusion-exclusion purposes, we need the BGF of 6 with v marking the number of 
distinguished rises. A cluster of size k contains k — 1 rises, so that 


1 v 


BUM) = TS ener ala) eto 


Now, the usual argument applies: the BGF A(z, u) satisfies B(z,v) = A(z,1-+ v), so that 
A(z,u) = B(z,u— 1), which yields the particularly simple form 


u—1l 
u— ez(u-1)" 


(73) A(z,u) = 


In particular, this GF expands as

$$A(z,u) = 1 + z + (u + 1)\frac{z^2}{2!} + (u^2 + 4u + 1)\frac{z^3}{3!} + (u^3 + 11u^2 + 11u + 1)\frac{z^4}{4!} + \cdots.$$

The coefficients $A_{n,k}$ are known as the Eulerian numbers. In combinatorial analysis, these
numbers are almost as classic as the Stirling numbers. A detailed discussion of their properties
is to be found in classical treatises like [98] or [248]. (From Eq. (73), permutations without
rises are enumerated by $B(z, -1) = e^z$, an altogether obvious result.)
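The first few Eulerian numbers can be checked mechanically. The sketch below is only an illustration: it assumes the classical recurrence $A_{n,k} = (k+1)A_{n-1,k} + (n-k)A_{n-1,k-1}$ (a standard fact, not derived in the text above) and compares it with a direct count of rises over all permutations. The function names are hypothetical.

```python
from itertools import permutations

def eulerian_row(n):
    """Return [A(n,0), ..., A(n,n-1)]: permutations of size n by number of rises.

    Uses the classical recurrence (assumed here, not taken from the text):
    A(m,k) = (k+1) A(m-1,k) + (m-k) A(m-1,k-1).
    """
    row = [1]  # size 1: the single permutation has no rise
    for m in range(2, n + 1):
        prev = row + [0]
        row = [(k + 1) * prev[k] + (m - k) * (prev[k - 1] if k > 0 else 0)
               for k in range(m)]
    return row

def rises_by_brute_force(n):
    """Count permutations of size n by number of rises, by exhaustive listing."""
    counts = [0] * n
    for perm in permutations(range(1, n + 1)):
        k = sum(1 for a, b in zip(perm, perm[1:]) if a < b)
        counts[k] += 1
    return counts
```

For instance, `eulerian_row(4)` reproduces the coefficients of $z^4/4!$ in the expansion of $A(z,u)$, and the two functions can be compared for any small size.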

Moments derive easily from an expansion of (73) at $u = 1$, which gives

$$A(z,u) = \frac{1}{1-z} + \frac{1}{2}\frac{z^2}{(1-z)^2}(u-1) + \frac{1}{12}\frac{z^3(2+z)}{(1-z)^3}(u-1)^2 + \cdots.$$

In particular: the mean of the number of rises in a random permutation of size $n$ is $\frac{1}{2}(n-1)$
and the variance is $\sim \frac{1}{12}n$, ensuring concentration of distribution.

The same method applies to the enumeration of ascending runs: for a fixed parameter $\ell$,
an ascending run of length $\ell$ is a sequence of consecutive elements $\sigma_i \sigma_{i+1} \cdots \sigma_{i+\ell}$ such that
$\sigma_i < \sigma_{i+1} < \cdots < \sigma_{i+\ell}$. (Thus, a rise is an ascending run of length 1.) We define a cluster as a
sequence of distinguished runs which overlap in the sense that they share some of the elements
of the permutation. The exponential BGF of permutations with distinguished ascending runs is
then

$$B(z,v) = \frac{1}{1 - z - \widehat{I}(z,v)}, \quad\text{where}\quad \widehat{I}(z,v) = \sum_{n,k} I_{n,k}\, v^k \frac{z^n}{n!},$$

and $I_{n,k}$ is the number of ways of covering the segment $[1, n]$ with $k$ distinct intervals of length $\ell$
that are contained in $[1, n]$ and have integral end points. The numbers $I_{n,k}$ themselves result
from elementary combinatorics (see also the case of patterns in words below) and one has for
the OGF corresponding to $\widehat{I}$:

$$I(z,v) = \sum_{n,k} I_{n,k}\, v^k z^n = \frac{z^{\ell+1} v}{1 - v(z + z^2 + \cdots + z^{\ell})}.$$

(Proof: The first segment in the covering must be placed on the left, the other ones appear in
succession, each shifted right by 1 to $\ell$ positions from the previous one.) The last two equations
finally determine the exponential BGF of permutations with size marked by $z$ and ascending
runs of length $\ell + 1$ marked by $u$,

$$(74)\qquad A(z,u) = B(z, u - 1),$$

given the inclusion-exclusion principle.

The resulting formulae are checked to generalize the case of rises ($\ell = 1$). They can
be made explicit by first expanding the OGF $I(z,v)$ into partial fractions, then applying the
transformation $(1 - \omega z)^{-1} \mapsto e^{\omega z}$ in order to translate $I(z,v)$ into $\widehat{I}(z,v)$. The net result,

$$A(z,u) = \frac{1}{1 - z - \widehat{I}(z, u-1)}, \quad\text{where}\quad \widehat{I}(z,v) = (1-z)(v+1) + \sum_{j=1}^{\ell} c_j(v)\, e^{\omega_j(v) z},$$

involves a sum of exponentials. In this last equation, the $\omega_j(v)$ are the roots of the characteristic
equation $\omega^{\ell} = v(1 + \cdots + \omega^{\ell-1})$ and the $c_j(v)$ are the corresponding coefficients in the
partial fraction decomposition of $I(z,v)$. These expressions were first published by Elizalde
and Noy [150] who obtained them by means of tree decompositions.

The BGF (74) can be exploited in order to determine quantitative information on long runs
in permutations. First, an expansion at $u = 1$ (also, a direct reasoning: see the discussion of
hidden words in Chapter I) shows that the mean number of ascending runs of length $\ell - 1$ is
$(n - \ell + 1)/\ell!$ exactly, as soon as $n \ge \ell$. This entails that, if $n = o(\ell!)$, the probability of
finding an ascending run of length $\ell - 1$ tends to 0 as $n \to \infty$. What is used in passing in this
argument is the general fact that for a discrete variable $X$ with values in $0, 1, 2, \ldots$, one has
(with Iverson's notation)

$$\mathbb{P}(X \ge 1) = \mathbb{E}([\![X \ge 1]\!]) = \mathbb{E}(\min(X, 1)) \le \mathbb{E}(X).$$
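The mean value $(n - \ell + 1)/\ell!$ can be confirmed by exhaustive enumeration for small sizes. The following sketch (the function name is hypothetical) counts windows of $\ell$ consecutive increasing entries, that is, ascending runs of length $\ell - 1$ in the terminology above, and returns the exact average over all permutations.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def mean_runs(n, ell):
    """Exact mean number of windows of `ell` consecutive increasing entries
    (ascending runs of length ell - 1) over all permutations of size n."""
    total = 0
    for perm in permutations(range(n)):
        for i in range(n - ell + 1):
            window = perm[i:i + ell]
            if all(a < b for a, b in zip(window, window[1:])):
                total += 1
    return Fraction(total, factorial(n))
```

Each of the $n - \ell + 1$ windows is increasing with probability $1/\ell!$, which is the direct reasoning alluded to in the text; the enumeration only serves as a sanity check.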


An inequality in the converse direction can be obtained from the second moment method. In
effect, the variance of the number of ascending runs of length $\ell - 1$ is found to be of the exact
form $\alpha_\ell n + \beta_\ell$ where $\alpha_\ell$ is essentially $1/\ell!$ and $\beta_\ell$ is of comparable order (details omitted).
Then, by Chebyshev's inequalities, concentration of distribution holds as long as $\ell$ is such that
$(\ell + 1)! = o(n)$. In this case, with high probability (i.e., with probability tending to 1 as $n$
tends to $\infty$), there are many ascending runs of length $\ell - 1$. In particular:

Let $L_n$ be the length of the longest ascending run in a random permutation of $n$
elements. Let $\ell_0(n)$ be the smallest integer such that $\ell! \ge n$. Then the distribution
of $L_n$ is concentrated in the sense that $L_n/\ell_0(n)$ converges in probability to 1: for
any $\epsilon > 0$, one has

$$\lim_{n \to \infty} \mathbb{P}\left(1 - \epsilon < \frac{L_n}{\ell_0(n)} < 1 + \epsilon\right) = 1.$$

What has been found here is a fairly sharp threshold phenomenon. END OF EXAMPLE III.24.
▷ III.35. Permutations without $\ell$-ascending runs. The EGFs of permutations without 1-, 2-
and 3-ascending runs are respectively

$$\left(\sum_{i \ge 0} \frac{x^{2i}}{(2i)!} - \frac{x^{2i+1}}{(2i+1)!}\right)^{-1}, \quad \left(\sum_{i \ge 0} \frac{x^{3i}}{(3i)!} - \frac{x^{3i+1}}{(3i+1)!}\right)^{-1}, \quad \left(\sum_{i \ge 0} \frac{x^{4i}}{(4i)!} - \frac{x^{4i+1}}{(4i+1)!}\right)^{-1},$$

and so on. (See Carlitz's review [78] as well as Elizalde and Noy's article [150] for interesting
results involving several types of order patterns in permutations.) ◁

Many variations on the theme of rises and ascending runs are clearly possible. Lo- 
cal order patterns in permutations have been intensely researched, notably by Carlitz 
in the 1970’s. Goulden and Jackson [244, Sec. 4.3] offer a general theory of patterns 
in sequences and permutations. Special permutation patterns associated with binary
increasing trees are also studied by Flajolet, Gourdon, and Martinez [185] (by combinatorial
methods) and Devroye [125] (by probabilistic arguments). On another register,
the longest ascending run has been found above to be of order $(\log n)/\log\log n$
in probability. The superficially similar problem of analysing the length of the
longest increasing sequence in random permutations (elements must be in ascending
order but need not be adjacent) has attracted a lot of attention, but is considerably
harder. This quantity is $\sim 2\sqrt{n}$ on average and in probability, as shown by a penetrating
analysis of the shape of random Young tableaux due to Logan, Shepp, Vershik,
and Kerov [336, 485]. Solving a problem open for over 20 years, Baik, Deift, and Jo- 
hansson [19] have eventually determined its limiting distribution. The undemanding 
survey by Aldous and Diaconis [7] discusses some of the background of this prob- 
lem, while Chapter VIII shows how to derive bounds that are of the right order of 
magnitude but rather crude, using saddle-point methods. 


EXAMPLE III.25. Patterns in words. Take the set of all words $\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ over a
finite alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$. A pattern $\mathfrak{p} = p_1 p_2 \cdots p_k$, which is a particular word of
length $k$, has been fixed. What is sought is the BGF $W(z,u)$ of $\mathcal{W}$, where $u$ marks the number
of occurrences of pattern $\mathfrak{p}$ inside a word of $\mathcal{W}$. Results of Chapter I already give access to
$W(z, 0)$, which is the OGF of words not containing the pattern.

In accordance with the inclusion-exclusion principle, one should introduce the class $\mathcal{X}$ of
words augmented by distinguishing an arbitrary number of occurrences of $\mathfrak{p}$. Define a cluster
as a maximal collection of distinguished occurrences that have an overlap. For instance, if
$\mathfrak{p} = aaaaa$, a particular word may give rise to the particular cluster:


abaaaaaaaaaaAaAaAaAbaaaAaAaAaaabhbnhb 


aaaaa 
aaaaa 
aaaaa 


Then objects of $\mathcal{X}$ decompose as sequences of either arbitrary letters from $\mathcal{A}$ or clusters:

$$\mathcal{X} = \mathrm{SEQ}(\mathcal{A} + \mathcal{C}),$$

with $\mathcal{C}$ the class of all clusters.

Clusters are themselves obtained by repeatedly sliding the pattern, but with the constraint
that it should constantly overlap partly with itself. Let $c(z)$ be the autocorrelation polynomial
of $\mathfrak{p}$ as defined in Chapter I, and set $\widehat{c}(z) = c(z) - 1$. A moment's reflection should convince
the reader that $z^k \widehat{c}(z)^{s-1}$ when expanded describes all the possibilities for forming clusters
of $s$ overlapping occurrences. On the example above, one has $\widehat{c}(z) = z + z^2 + z^3 + z^4$, and
a particular cluster of 3 overlapping occurrences corresponds to one of the terms in $z^k \widehat{c}(z)^2$ as
follows:


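$$\underbrace{aaaaa}_{z^5} \;\times\; (z + z^2 + z^3 + z^4) \;\times\; (z + z^2 + z^3 + z^4).$$

(The first occurrence contributes $z^5$; each of the two subsequent occurrences selects one term
of its factor, according to the number of letters by which it extends the cluster.)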


The OGF of clusters is consequently $C(z) = z^k/(1 - \widehat{c}(z))$ since this quantity describes all the
ways to write the pattern ($z^k$) and then slide it so that it should overlap with itself (this is given
by $(1 - \widehat{c}(z))^{-1}$). A slightly different way of obtaining this expression of $C(z)$ is described in
Note 38 below.

By a similar reasoning, the BGF of clusters is $v z^k/(1 - v\widehat{c}(z))$, and the BGF of $\mathcal{X}$ with
the supplementary variable $v$ marking the number of distinguished occurrences is

$$X(z,v) = \frac{1}{1 - rz - v z^k/(1 - v\widehat{c}(z))}.$$

Finally, the usual inclusion-exclusion argument (change $v$ to $u - 1$) yields
$W(z,u) = X(z, u - 1)$. As a result:

For a pattern $\mathfrak{p}$ with correlation polynomial $c(z)$ and length $k$, the BGF
of words over an alphabet of cardinality $r$, where $u$ marks the number of
occurrences of $\mathfrak{p}$, is

$$W(z,u) = \frac{(u-1)c(z) - u}{(1 - rz)\bigl((u-1)c(z) - u\bigr) + (u-1)z^k}.$$

The specialization $u = 0$ gives back the formula already found in Chapter I. The same
principles clearly apply to weighted models corresponding to unequal letter probabilities,
provided a suitably weighted version of the correlation polynomial is introduced (Note 38 below).
END OF EXAMPLE III.25.
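The rational form $W(z,u) = \bigl((u-1)c(z) - u\bigr) / \bigl((1 - rz)((u-1)c(z) - u) + (u-1)z^k\bigr)$ lends itself to a direct numerical check. The sketch below is an illustration only, with hypothetical function names: it expands the fraction as a power series in $z$ whose coefficients are polynomials in $u$ (stored as coefficient lists), and compares the result with an exhaustive count of occurrences, overlaps allowed. It relies on the fact that the denominator has constant term $-1$, since $c(0) = 1$.

```python
from itertools import product

def autocorrelation(p):
    """c_j = 1 when p[j:] matches p[:k-j], i.e. p shifted by j overlaps itself."""
    k = len(p)
    return [1 if p[j:] == p[:k - j] else 0 for j in range(k)]

def u_add(a, b):
    """Add two polynomials in u given as coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def u_mul(a, b):
    """Multiply two polynomials in u given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def strip(a):
    """Drop trailing zero coefficients (keep at least one entry)."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def word_pattern_bgf(p, r, n_max):
    """Return [w_0(u), ..., w_{n_max}(u)] with W(z,u) = sum_n w_n(u) z^n."""
    k = len(p)
    c = autocorrelation(p)
    # Numerator N(z,u) = (u-1) c(z) - u; N[j] is the u-polynomial multiplying z^j.
    N = [[-c[j], c[j]] for j in range(k)] + [[0]]
    N[0] = u_add(N[0], [0, -1])
    # Denominator D(z,u) = (1 - r z) N(z,u) + (u-1) z^k.
    D = [u_add(N[j], [-r * x for x in N[j - 1]] if j > 0 else [0]) for j in range(k + 1)]
    D[k] = u_add(D[k], [-1, 1])
    W = []
    for n in range(n_max + 1):
        # D[0] = -1, so W * D = N gives w_n = sum_{i>=1} D[i] w_{n-i} - N[n].
        acc = [-x for x in N[n]] if n <= k else [0]
        for i in range(1, min(n, k) + 1):
            acc = u_add(acc, u_mul(D[i], W[n - i]))
        W.append(strip(acc))
    return W

def brute_force(p, alphabet, n):
    """Count words of length n by number of (possibly overlapping) occurrences of p."""
    k, counts = len(p), {}
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        s = sum(1 for i in range(n - k + 1) if w[i:i + k] == p)
        counts[s] = counts.get(s, 0) + 1
    return [counts.get(s, 0) for s in range(max(counts) + 1)]
```

With this representation, `word_pattern_bgf("aba", 2, 3)[3]` is the list of coefficients of $u^0, u^1$ in $[z^3]W(z,u)$ for the binary alphabet, and the constant terms of the successive entries give the words avoiding the pattern.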


There are a very large number of formulae related to patterns in strings. For
instance, BGFs are known for occurrences of one or several patterns under either 
Bernoulli or Markov models; see Note 38 below. We refer to Szpankowski’s book [458] 
and Lothaire’s chapter [280], where such questions are treated systematically in great 
detail. Bourdon and Vallée [65] have even succeeded in extending this approach to 
dynamical sources of information, thereby extending a large number of previously 
known results. Their approach even makes it possible to analyse the occurrence of 
patterns in continued fraction representations of real numbers. 


▷ III.36. Moments of number of occurrences. The derivatives of $X(z,v)$ at $v = 0$ give access
to the factorial moments of the number of occurrences of a pattern. In this way or directly, one
determines

$$W(z,u) = \frac{1}{1-rz} + \frac{z^k}{(1-rz)^2}(u-1) + \frac{2z^k\bigl((1-rz)(c(z)-1) + z^k\bigr)}{(1-rz)^3}\frac{(u-1)^2}{2!} + \cdots.$$

The mean number of occurrences is $r^{-n}$ times the coefficient of $z^n$ in the coefficient of $(u-1)$
and is $(n-k+1)r^{-k}$, as anticipated. The coefficient of $(u-1)^2/2!$ is of the form

$$\frac{2r^{-2k}}{(1-rz)^3} - \frac{2r^{-k}\bigl(1 + 2kr^{-k} - c(1/r)\bigr)}{(1-rz)^2} + \frac{P(z)}{1-rz},$$

with $P$ a polynomial. There results that the variance of the number of occurrences is of the
form

$$\alpha n + \beta, \qquad \alpha = r^{-k}\bigl(2c(1/r) - 1 + r^{-k}(1 - 2k)\bigr).$$

Consequently, the distribution is concentrated around its mean. (See also the discussion of
“Borges’ Theorem” in Chapter I, p. 58.) ◁


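▷ III.37. Words with fixed repetitions. Let $W^{\langle s \rangle}(z) = [u^s]W(z,u)$ be the OGF of words
containing a pattern exactly $s$ times. One has, for $s > 0$ and $s = 0$ respectively,

$$W^{\langle s \rangle}(z) = z^k \frac{N(z)^{s-1}}{D(z)^{s+1}}, \qquad W^{\langle 0 \rangle}(z) = \frac{c(z)}{D(z)},$$

with $N(z)$ and $D(z)$ given by

$$N(z) = (1-rz)(c(z) - 1) + z^k, \qquad D(z) = (1-rz)c(z) + z^k.$$

The expression of $W^{\langle 0 \rangle}$ is in agreement with Chapter I, Equation (48). ◁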


▷ III.38. Patterns in Bernoulli sequences. Let $\mathcal{A}$ be an alphabet where letter $a$ has probability
$\pi_a$ and consider the Bernoulli model where letters in words are chosen independently. Fix a
pattern $\mathfrak{p} = p_1 \cdots p_k$ and define the finite language of protrusions as

$$\Gamma = \bigcup_{i \,:\, c_i \ne 0} \{p_{i+1} p_{i+2} \cdots p_k\},$$

where the union is over all correlation positions of the pattern. Define now the correlation
polynomial $\gamma(z)$ (relative to $\mathfrak{p}$ and the $\pi_a$) as the generating polynomial of the finite language
of protrusions weighted by $\pi_a$. For instance, $\mathfrak{p} = ababa$ gives rise to $\Gamma = \{\epsilon, ba, baba\}$ and

$$\gamma(z) = 1 + \pi_a \pi_b z^2 + \pi_a^2 \pi_b^2 z^4.$$

Then, the BGF of words with $z$ marking length and $u$ marking the number of occurrences of $\mathfrak{p}$
is

$$W(z,u) = \frac{(u-1)\gamma(z) - u}{(1 - z)\bigl((u-1)\gamma(z) - u\bigr) + (u-1)\pi[\mathfrak{p}] z^k},$$

where $\pi[\mathfrak{p}]$ is the product of the probabilities of letters of $\mathfrak{p}$. ◁


▷ III.39. Patterns in binary trees. Consider the class $\mathcal{B}$ of pruned binary trees. An occurrence
of pattern $t$ in a tree $\tau$ is defined by a node whose “dangling subtree” is isomorphic to $t$. Let $p$
be the size of $t$. The BGF $B(z,u)$ of class $\mathcal{B}$ where $u$ marks the number of occurrences of $t$ is
sought.

The OGF of $\mathcal{B}$ is $B(z) = (1 - \sqrt{1 - 4z})/(2z)$. The quantity $vB(zv)$ is the BGF of $\mathcal{B}$
with $v$ marking external nodes. By virtue of the pointing operation, the quantity

$$U_k := \left.\frac{1}{k!}\, \partial_v^k \bigl(vB(zv)\bigr)\right|_{v=1}$$

describes trees with $k$ distinct external nodes distinguished (pointed). The quantity

$$V := \sum_k U_k u^k (z^p)^k \quad\text{satisfies}\quad V = \bigl(vB(zv)\bigr)_{v = 1 + uz^p}$$

by virtue of Taylor's formula. It is also the BGF of trees with distinguished occurrences of $t$.
Setting $u \mapsto u - 1$ in $V$ then gives back $B(z,u)$ as

$$B(z,u) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z - 4(u-1)z^{p+1}}\right).$$

In particular

$$B(z,0) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z + 4z^{p+1}}\right)$$

gives the OGF of trees not containing pattern $t$. The method generalizes to any simple variety
of trees and it can be used to prove that the factored representation (as a directed acyclic graph)
of a random tree of size $n$ has expected size $O(n/\sqrt{\log n})$; see [209]. ◁
III. 8. Extremal parameters


Apart from additively inherited parameters already examined at length in this 
chapter, another important category is that of parameters defined by a maximum rule. 
Two major cases are the largest component in a combinatorial structure (for instance, 
the largest cycle of a permutation) and the maximum degree of nesting of construc- 
tions in a recursive structure (typically, the height of a tree). In this case, bivariate 
generating functions are of little help. The standard technique consists in introducing 
a collection of univariate generating functions defined by imposing a bound on the 
parameter of interest. Such GFs can then be constructed by the symbolic method in 
its univariate version. 


III. 8.1. Largest components. Consider a construction $\mathcal{B} = \Phi\{\mathcal{A}\}$, where $\Phi$
may involve an arbitrary combination of basic constructions, and assume here for
simplicity that the construction for $\mathcal{B}$ is a non-recursive one. This corresponds to a
relation between generating functions

$$B(z) = \Psi[A(z)],$$

where $\Psi$ is the functional that is the “image” of the combinatorial construction $\Phi$.
Elements of $\mathcal{A}$ thus appear as components in an object $\beta \in \mathcal{B}$. Let $\mathcal{B}^{\langle b \rangle}$ denote the
subclass of $\mathcal{B}$ formed with objects whose $\mathcal{A}$-components all have a size at most $b$. The
GF of $\mathcal{B}^{\langle b \rangle}$ is obtained by the same process as that of $\mathcal{B}$ itself, save that $A(z)$ should
be replaced by the GF of elements of size at most $b$. Thus,

$$B^{\langle b \rangle}(z) = \Psi[\mathbf{T}_b A(z)],$$

where the truncation operator is defined on series by

$$\mathbf{T}_b f(z) = \sum_{n=0}^{b} f_n z^n \qquad \Bigl(f(z) = \sum_{n=0}^{\infty} f_n z^n\Bigr).$$
n=0 n=0 


Several cases of this situation have already been encountered in earlier chapters. 
For instance, the cycle decomposition of permutations translated by

$$P(z) = \exp\left(\log \frac{1}{1-z}\right)$$

gives more generally the EGF of permutations with longest cycle $\le b$,

$$P^{\langle b \rangle}(z) = \exp\left(\frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^b}{b}\right),$$

which involves the truncated logarithm. Similarly, the EGF of words over an $m$-ary
alphabet

$$W(z) = (e^z)^m$$

leads to the EGF of words such that each letter occurs at most $b$ times:

$$W^{\langle b \rangle}(z) = \left(1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}\right)^m,$$

which now involves the truncated exponential. One finds similarly the EGF of set
partitions with largest block of size at most $b$,

$$S^{\langle b \rangle}(z) = \exp\left(\frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}\right).$$
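As an illustration of the truncation principle, the coefficients of $\exp(z/1 + z^2/2 + \cdots + z^b/b)$ can be extracted exactly and compared with a direct count of permutations whose cycles all have length at most $b$. The sketch below (hypothetical function names) uses the standard identity $E' = A'E$ for $E = \exp(A)$, which here gives $n e_n = e_{n-1} + \cdots + e_{n-\min(b,n)}$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def count_longest_cycle_at_most(n_max, b):
    """Return n! [z^n] exp(z/1 + z^2/2 + ... + z^b/b) for n = 0..n_max."""
    e = [Fraction(1)]
    for n in range(1, n_max + 1):
        # From E' = A' E with A'(z) = 1 + z + ... + z^(b-1).
        e.append(sum(e[n - k] for k in range(1, min(b, n) + 1)) / n)
    return [int(e[n] * factorial(n)) for n in range(n_max + 1)]

def longest_cycle(perm):
    """Length of the longest cycle of a permutation of range(len(perm))."""
    seen, best = set(), 0
    for start in range(len(perm)):
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        best = max(best, length)
    return best
```

For $b = 2$ the counts are those of involutions, and for $b \ge n$ one recovers $n!$, as expected from $P(z) = 1/(1-z)$.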
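A slightly less direct example is that of the longest run in a sequence of binary
draws. The collection $\mathcal{W}$ of binary strings over the alphabet $\{a, b\}$ admits the decomposition

$$\mathcal{W} = \mathrm{SEQ}(a) \cdot \mathrm{SEQ}(b\, \mathrm{SEQ}(a)),$$

corresponding to a “scansion” dictated by the occurrences of the letter $b$. The corresponding
OGF then appears under the form

$$W(z) = Y(z) \cdot \frac{1}{1 - zY(z)}, \quad\text{where}\quad Y(z) = \frac{1}{1-z}$$

corresponds to $\mathcal{Y} = \mathrm{SEQ}(a)$. Thus, the OGF of strings with at most $k - 1$ consecutive
occurrences of the letter $a$ obtains upon replacing $Y(z)$ by its truncation:

$$W^{\langle k \rangle}(z) = Y^{\langle k \rangle}(z) \cdot \frac{1}{1 - zY^{\langle k \rangle}(z)}, \quad\text{where}\quad Y^{\langle k \rangle}(z) = 1 + z + z^2 + \cdots + z^{k-1},$$

so that

$$W^{\langle k \rangle}(z) = \frac{1 - z^k}{1 - 2z + z^{k+1}}.$$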
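The rational form $(1 - z^k)/(1 - 2z + z^{k+1})$ translates into the linear recurrence $w_n = 2w_{n-1} - w_{n-k-1} + [\![n = 0]\!] - [\![n = k]\!]$ for its coefficients. A minimal sketch, with hypothetical function names, that unwinds this recurrence and compares it with an exhaustive count of binary strings having no run of $k$ consecutive letters $a$:

```python
from itertools import product

def count_bounded_runs(n_max, k):
    """Return [z^n] (1 - z^k) / (1 - 2z + z^(k+1)) for n = 0..n_max."""
    w = []
    for n in range(n_max + 1):
        rhs = (1 if n == 0 else 0) - (1 if n == k else 0)
        val = rhs
        if n >= 1:
            val += 2 * w[n - 1]
        if n >= k + 1:
            val -= w[n - k - 1]
        w.append(val)
    return w

def brute_force_runs(n, k):
    """Number of strings over {a, b} of length n with at most k-1 consecutive a's."""
    return sum(1 for s in product("ab", repeat=n) if "a" * k not in "".join(s))
```

For $k = 2$ the counts are Fibonacci numbers, a familiar special case.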


Such generating functions are thus easy to derive. The asymptotic analysis of 
their coefficients is however often hard when compared to additive parameters, owing 
to the need to rely on complex analytic properties of the truncation operator. The bases 
of a general asymptotic theory have been laid by Gourdon [246]. 
▷ III.40. Smallest components. The EGF of permutations with smallest cycle of size $> b$ is

$$\frac{\exp\left(-\frac{z}{1} - \frac{z^2}{2} - \cdots - \frac{z^b}{b}\right)}{1 - z}.$$

A symbolic theory of smallest components in combinatorial structures is easily developed as
regards GFs. Elements of the corresponding asymptotic theory are provided by Panario and
Richmond in [385]. ◁


III. 8.2. Height. The degree of nesting of a recursive construction is a generalization
of the notion of height in the simpler case of trees. Consider for instance a
recursively defined class

$$\mathcal{B} = \Phi\{\mathcal{B}\},$$

where $\Phi$ is a construction. Let $\mathcal{B}^{[h]}$ denote the subclass of $\mathcal{B}$ composed solely of elements
whose construction involves at most $h$ applications of $\Phi$. We have by definition

$$\mathcal{B}^{[h+1]} = \Phi\{\mathcal{B}^{[h]}\}.$$

Thus, with $\Psi$ the image functional of construction $\Phi$, the corresponding GFs are defined
by a recurrence,

$$B^{[h+1]} = \Psi[B^{[h]}].$$

It is usually convenient to start the recurrence with the initial condition $B^{[-1]}(z) = 0$.
(This discussion is related to semantics of recursion, p. 31.)
Consider for instance general plane trees defined by

$$\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \quad\text{so that}\quad G(z) = \frac{z}{1 - G(z)}.$$

Define the height of a tree as the number of edges on its longest branch. Then the set
of trees of height $\le h$ satisfies the recurrence

$$\mathcal{G}^{[0]} = \mathcal{Z}, \qquad \mathcal{G}^{[h+1]} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}^{[h]}).$$

Accordingly, the OGF of trees of bounded height satisfies

$$G^{[-1]}(z) = 0, \qquad G^{[0]}(z) = z, \qquad G^{[h+1]}(z) = \frac{z}{1 - G^{[h]}(z)}.$$

The recurrence unwinds and one finds

$$(75)\qquad G^{[h]}(z) = \cfrac{z}{1 - \cfrac{z}{1 - \cfrac{z}{\;\ddots\; - \cfrac{z}{1 - z}}}},$$

where the number of stages in the fraction equals $h$. This is the finite form (technically
known as a “convergent”) of a continued fraction expansion. From implied
linear recurrences and an analysis based on Mellin transforms, de Bruijn, Knuth, and
Rice [113] have determined the average height of a general plane tree to be $\sim \sqrt{\pi n}$.
We provide a proof of this fact in Chapter V dedicated to applications of rational and 
meromorphic asymptotics. 
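The recurrence $G^{[h+1]}(z) = z/(1 - G^{[h]}(z))$, started at $G^{[-1]}(z) = 0$, is easy to iterate on truncated power series. The sketch below (the function name is hypothetical) returns the number of general plane trees of each size with height at most $h$; once $h \ge n - 1$ the count at size $n$ stabilizes to the Catalan number $\frac{1}{n}\binom{2n-2}{n-1}$.

```python
from math import comb

def bounded_height_counts(h, n_max):
    """Coefficients of z^0..z^n_max in G^[h](z), where G^[-1] = 0 and
    G^[j+1] = z / (1 - G^[j])."""
    g = [0] * (n_max + 1)              # G^[-1](z) = 0
    for _ in range(h + 1):
        # s = 1 / (1 - g), computed from s = 1 + g*s (g has no constant term)
        s = [1] + [0] * n_max
        for n in range(1, n_max + 1):
            s[n] = sum(g[i] * s[n - i] for i in range(1, n + 1))
        g = [0] + s[:n_max]            # multiply by z, truncate at z^n_max
    return g
```

For example, height at most 1 leaves exactly one tree of each size (a root with leaves attached), in agreement with $G^{[1]}(z) = z/(1-z)$.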

For plane binary trees defined by

$$\mathcal{B} = \mathcal{Z} + \mathcal{B} \times \mathcal{B} \quad\text{so that}\quad B(z) = z + (B(z))^2,$$

(size is the number of external nodes), the recurrence is

$$B^{[0]}(z) = z, \qquad B^{[h+1]}(z) = z + (B^{[h]}(z))^2.$$

In this case, the $B^{[h]}$ are the approximants to a “continuous quadratic form”, namely

$$B^{[h]}(z) = z + (z + (z + (\cdots)^2)^2)^2.$$

These are polynomials of degree $2^h$ for which no closed form expression is known,
nor even likely to exist⁶. However, using complex asymptotic methods and singularity
analysis, Flajolet and Odlyzko [197] have shown that the average height of a binary
plane tree is $\sim 2\sqrt{\pi n}$.

⁶ These polynomials are exactly the much studied Mandelbrot polynomials whose behaviour in the
complex plane gives rise to extraordinary graphics.

For Cayley trees, finally, the defining equation is

$$\mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T}) \quad\text{so that}\quad T(z) = z e^{T(z)}.$$

The EGFs of trees of bounded height satisfy the recurrence

$$T^{[0]}(z) = z, \qquad T^{[h+1]}(z) = z e^{T^{[h]}(z)}.$$

We are now confronted with a “continuous exponential”,

$$T^{[h]}(z) = z e^{z e^{z e^{\cdot^{\cdot^{\cdot^{z e^{z}}}}}}}.$$

The average height was found by Rényi and Szekeres who appealed again to complex
asymptotics and found it to be $\sim \sqrt{2\pi n}$.
These examples show that height statistics are closely related to iteration theory. 
Except in a few cases like general plane trees, normally no algebra is available and 
one has to resort to complex analytic methods as exposed in forthcoming chapters. 


III. 8.3. Averages and moments. For extremal parameters, the GFs of mean values
obey a general pattern. Let $\mathcal{F}$ be some combinatorial class with GF $f(z)$. Consider
for instance an extremal parameter $\chi$ such that $f^{[h]}(z)$ is the GF of objects with
$\chi$-parameter at most $h$. The GF of objects for which $\chi = h$ exactly is equal to

$$f^{[h]}(z) - f^{[h-1]}(z).$$

Thus differencing gives access to the probability distribution of height over $\mathcal{F}$. The
generating function of cumulated values (providing mean values after normalization)
is then

$$\Xi(z) = \sum_{h=0}^{\infty} h \left[f^{[h]}(z) - f^{[h-1]}(z)\right] = \sum_{h=0}^{\infty} \left[f(z) - f^{[h]}(z)\right],$$
as is readily checked by rearranging the second sum, or equivalently using summation 
by parts. 
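The cumulated GF $\sum_{h \ge 0} [f(z) - f^{[h]}(z)]$ can be evaluated exactly for small sizes. As an illustration, the sketch below takes plane binary trees (size counted by external nodes) with the recurrence $B^{[0]}(z) = z$, $B^{[h+1]}(z) = z + (B^{[h]}(z))^2$, works with series truncated at order $n_{\max}$, and returns the exact mean height for each size. The function names are hypothetical; the computation uses the observation that a tree with $n$ external nodes has height at most $n - 1$, so that $B^{[n_{\max}-1]}$ agrees with $B$ up to order $n_{\max}$.

```python
from fractions import Fraction

def mul_trunc(a, b, n_max):
    """Product of two coefficient lists, truncated after z^n_max."""
    out = [0] * (n_max + 1)
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b):
                if i + j > n_max:
                    break
                out[i + j] += x * y
    return out

def mean_heights_binary(n_max):
    """Exact mean height of plane binary trees with n = 1..n_max external nodes."""
    b = [0, 1] + [0] * (n_max - 1)         # B^[0](z) = z
    approximants = [b]
    for _ in range(n_max - 1):
        b = mul_trunc(b, b, n_max)
        b[1] += 1                          # B^[h+1] = z + (B^[h])^2
        approximants.append(b)
    full = approximants[-1]                # agrees with B(z) up to z^n_max
    # Cumulated heights: sum over h of (B - B^[h]), coefficient by coefficient.
    xi = [sum(full[n] - a[n] for a in approximants) for n in range(n_max + 1)]
    return [Fraction(xi[n], full[n]) for n in range(1, n_max + 1)]
```

For $n = 4$ there are five trees, one of height 2 and four of height 3, so the mean height is $14/5$, which is what the computation returns.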


For maximum component size, the formulae involve truncated Taylor series. For
height, analysis involves in all generality the differences between the fixed point of a
functional $\Phi$ (the GF $f(z)$) and the approximations to the fixed point ($f^{[h]}(z)$) provided
by iteration. This is a common scheme in extremal statistics.

▷ III.41. Hierarchical partitions. Let $\varepsilon(z) = e^z - 1$. The generating function

$$\varepsilon(\varepsilon(\cdots(\varepsilon(z)))) \qquad (h \text{ times})$$

can be interpreted as the EGF of certain hierarchical partitions. (Such structures show up in
statistical classification theory [475, 476].) ◁
▷ III.42. Balanced trees. Balanced structures lead to counting GFs close to the ones obtained
for height statistics. The OGF of balanced 2-3 trees of height $h$ counted by the number of leaves
satisfies the recurrence

$$Z^{[h+1]}(z) = Z^{[h]}(z^2 + z^3) = (Z^{[h]}(z))^2 + (Z^{[h]}(z))^3,$$

which can be expressed in terms of the iterates of $\sigma(z) = z^2 + z^3$. It is also possible to express
the OGF of cumulated values of the number of internal nodes in such trees. ◁


▷ III.43. Extremal statistics in random mappings. One can express the EGFs relative to the
largest cycle, longest branch, and diameter of functional graphs. Similarly for the largest tree,
largest component. [Hint: see [198] for details.] ◁


▷ III.44. Deep nodes in trees. The BGF giving the number of nodes at maximal depth in
a general plane tree or a Cayley tree can be expressed in terms of a continued fraction or a
continuous exponential. ◁
III. 9. Perspective


The message of this chapter is that we can use the symbolic method not just to 
count combinatorial objects but also to quantify their properties. The relative ease 
with which we are able to do so is testimony to the power of the method as a major
organizing principle of analytic combinatorics. 

The global framework of the symbolic method leads us to a natural structural cat- 
egorization of parameters of combinatorial objects. First, the concept of inherited pa- 
rameters permits a direct extension of the already seen formal translation mechanisms 
from combinatorial structures to GFs, for both labelled and unlabelled objects—this 
leads to MGFs useful for solving a broad variety of classical combinatorial problems. 
Second, the adaptation of the theory to recursive parameters provides information 
about trees and similar structures, this even in the absence of explicit representations 
of the associated MGFs. Third, extremal parameters which are defined by a maximum 
rule (rather than an additive rule) can be studied by analysing families of univariate 
GFs. Yet another illustration of the power of the symbolic method is found in the 
notion of complete GFs, which in particular enable us to study Bernoulli trials and 
branching processes. 

As we shall see starting with Chapter IV, these approaches become especially 
powerful since they serve as the basis for the asymptotic analysis of properties of 
structures. Not only does the symbolic method provide precise information about 
particular parameters, but also it paves the way for the discovery of general theorems 
that tell us what to expect about a broad variety of combinatorial types. 


Multivariate generating functions are a common tool from classical combinatorial analy- 
sis. Comtet’s book [98] is once more an excellent source of examples. A systematization of 
multivariate generating functions for inherited parameters is given in the book by Goulden and 
Jackson [244]. 

In contrast, generating functions for cumulated values of parameters (related to averages)
seem to have received relatively little attention until the advent of digital computers and
the analysis of algorithms. Many important techniques are implicit in Knuth’s treatises, es- 
pecially [306, 307]. Wilf discusses related issues in his book [496] and the paper [494]. 
Early systems specialized to tree algorithms have been proposed by Flajolet and Steyaert in 
the 1980s [169, 213, 214, 455]; see also Berstel and Reutenauer’s work [44]. Some of the 
ideas developed there initially drew their inspiration from the well established treatment of 
formal power series in noncommutative indeterminates; see the books by Eilenberg [149] and 
Salomaa–Soittola [423] as well as the proceedings edited by Berstel [45]. Several computations
in this area can nowadays even be automated with the help of computer algebra systems, as 
shown by Flajolet, Salvy, and Zimmermann [206, 424, 515]. 
Complex Analysis, Rational and

The shortest path between two truths in the real domain
passes through the complex domain.

Generating functions are a central concept of combinatorial theory. In Part A,
we have treated them as formal objects, that is, as formal power series. Indeed, the 
major theme of Chapters I-III has been to demonstrate how the algebraic structure 
of generating functions directly reflects the structure of combinatorial classes. From 
now on, we examine generating functions in the light of analysis. This point of view 
involves assigning values to the variables that appear in generating functions. 

Comparatively little benefit results from assigning only real values to the vari- 
able z that figures in a univariate generating function. In contrast, assigning complex 
values turns out to have serendipitous consequences. When we do so, a generating 
function becomes a geometric transformation of the complex plane. This transforma- 
tion is very regular near the origin—one says that it is analytic (or holomorphic). In 
other words, near 0, it only effects a smooth distortion of the complex plane. Farther 
away from the origin, some cracks start appearing in the picture. These cracks—the 
dignified name is singularities—correspond to the disappearance of smoothness. It 
turns out that a function’s singularities provide a wealth of information regarding the 
function’s coefficients, and especially their asymptotic rate of growth. Adopting a 
geometric point of view for generating functions has a large pay-off. 

By focussing on singularities, analytic combinatorics treads in the steps of many
respectable older areas of mathematics. For instance, Euler recognized that the fact
that the Riemann zeta function $\zeta(s)$ becomes infinite at 1 implies the existence of
infinitely many prime numbers, while Riemann, Hadamard, and de la Vallée-Poussin
uncovered deeper connections between quantitative properties of prime numbers and
singularities of $1/\zeta(s)$.


1 Quoted in The Mathematical Intelligencer, v. 13, no. 1, Winter 1991. 
The purpose of this chapter is largely to serve as an accessible introduction or a
refresher of basic notions regarding analytic functions. We start by recalling the el- 
ementary theory of functions and their singularities in a style tuned to the needs of 
analytic combinatorics. Cauchy’s integral formula expresses coefficients of analytic 
functions as contour integrals. Suitable uses of Cauchy’s integral formula then make 
it possible to estimate such coefficients by suitably selecting an appropriate contour 
of integration. For the common case of functions that have singularities at a finite 
distance, the exponential growth formula relates the location of the singularities closest
to the origin (these are also known as dominant singularities) to the exponential
order of growth of coefficients. The nature of these singularities then dictates the fine 
structure of the asymptotics of the function’s coefficients, especially the subexponen- 
tial factors involved. 


As regards generating functions, combinatorial enumeration problems can be 
broadly categorized according to a hierarchy of increasing structural complexity. At 
the most basic level, we encounter scattered classes, which are simple enough, so that 
the associated generating function and coefficients can be made explicit. (Examples of 
Part A include binary and general plane trees, Cayley trees, derangements, mappings, 
and set partitions). In that case, elementary real-analysis techniques usually suffice 
to estimate asymptotically counting sequences. At the next, intermediate, level, the 
generating function is still explicit, but its form is such that no simple expression is 
available for coefficients. This is where the theory developed in this and the next chap- 
ters comes into play. It usually suffices to have an expression for a generating function, 
but not necessarily its coefficients, so as to be able to deduce precise asymptotic es- 
timates of its coefficients. (Surjections, generalized derangements, unary-binary trees 
are easily subjected to this method. A striking example, that of trains, is detailed in 
Section IV. 4.) Properties of analytic functions then make this analysis depend only on 
local properties of the generating function at a few points, its dominant singularities. 
The third, highest, level, within the perspective of analytic combinatorics, comprises 
generating functions that can no longer be made explicit, but are only determined by a 
functional equation. This covers structures defined recursively or implicitly by means 
of the basic constructors of Part A. The analytic approach even applies to a large 
number of such cases. (Examples include simple families of trees, balanced trees, 
and the enumeration of certain molecules treated at the end of this chapter. Another 
characteristic example is that of nonplane unlabelled trees treated in Chapter VII.) 

As we are going to see in this chapter and the next four ones, the analytic method- 
ology applies to almost all the combinatorial classes studied in Part A, which are pro- 
vided by the symbolic method. In the present chapter we carry out this programme 
for rational functions and meromorphic functions, where the latter are defined by the 
fact that their singularities are simply poles.


IV.1. Generating functions as analytic objects 


Generating functions, considered in Part A as purely formal objects subject to
algebraic operations, are now going to be interpreted as analytic objects. In so doing
one gains an easy access to the asymptotic form of their coefficients. This informal
section offers a glimpse of themes that form the basis of Chapters IV–VII.

FIGURE IV.1. Left: the graph of the Catalan OGF, $f(z)$, for $z \in (-\frac14, +\frac14)$; right: the
graph of the derangement EGF, $g(z)$, for $z \in (-1, +1)$.

In order to introduce the subject softly, let us start with two simple generating
functions, one, $f(z)$, being the OGF of the Catalan numbers (cf $G(z)$, p. 33), the
other, $g(z)$, being the EGF of derangements (cf $D^{(1)}(z)$, p. 113):

$$(1)\qquad f(z) = \frac{1}{2}\left(1 - \sqrt{1 - 4z}\right), \qquad g(z) = \frac{\exp(-z)}{1 - z}.$$
At this stage, the forms above are merely compact descriptions of formal power series 
built from the elementary series 


$$(1-y)^{-1} = 1 + y + y^2 + \cdots, \qquad (1-y)^{1/2} = 1 - \frac{1}{2}y - \frac{1}{8}y^2 - \cdots, \qquad \exp(y) = 1 + \frac{1}{1!}y + \frac{1}{2!}y^2 + \cdots,$$

by standard composition rules. Accordingly, the coefficients of both GFs are known 
in explicit form 

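$$f_n = [z^n] f(z) = \frac{1}{n}\binom{2n-2}{n-1}, \qquad g_n = [z^n] g(z) = \left(\frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!}\right).$$

Stirling's formula and comparison with the alternating series giving $\exp(-1)$ provide
respectively

$$(2)\qquad f_n \underset{n \to \infty}{\sim} \frac{4^{n-1}}{\sqrt{\pi n^3}}, \qquad g_n \underset{n \to \infty}{\sim} e^{-1} \doteq 0.36787.$$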

Our purpose now is to provide intuition on how such approximations could be 
derived without a recourse to explicit forms. We thus examine, heuristically for the 
moment, the direct relationship between the asymptotic forms (2) and the structure of 
the corresponding generating functions in (1). 
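Before turning to the analytic viewpoint, the two estimates can be checked numerically from the explicit forms $f_n = \frac{1}{n}\binom{2n-2}{n-1}$ and $g_n = \sum_{j \le n} (-1)^j/j!$. The sketch below is only a sanity check with hypothetical function names; the comparison value $4^{n-1}/\sqrt{\pi n^3}$ is what Stirling's formula gives when applied to the binomial expression of $f_n$.

```python
from fractions import Fraction
from math import comb, exp, factorial, pi, sqrt

def f_coeff(n):
    """f_n = (1/n) * binomial(2n-2, n-1), for n >= 1."""
    return comb(2 * n - 2, n - 1) // n

def g_coeff(n):
    """g_n = sum_{j=0}^{n} (-1)^j / j!, as an exact fraction."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(n + 1))

def f_ratio(n):
    """Ratio of f_n to the Stirling estimate 4^(n-1) / sqrt(pi * n^3)."""
    return f_coeff(n) / (4 ** (n - 1) / sqrt(pi * n ** 3))
```

The ratio `f_ratio(n)` approaches 1 slowly (the relative error decays like a constant times $1/n$), whereas $g_n$ converges to $e^{-1}$ extremely fast, the error being smaller than $1/(n+1)!$.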

Granted the growth estimates available for $f_n$ and $g_n$, it is legitimate to substitute
in the power series expansions of the GFs $f(z)$ and $g(z)$ any real or complex value
of a small enough modulus, the upper bounds on modulus being $\rho_f = \frac14$ (for $f$) and
$\rho_g = 1$ (for $g$). Figure 1 represents the graph of the resulting functions when such
real values are assigned to $z$. The graphs are smooth, representing functions that are
differentiable any number of times for $z$ interior to the interval $(-\rho, +\rho)$. However, at
the right boundary point, smoothness stops: $g(z)$ becomes infinite at $z = 1$, and so it
even ceases to be finitely defined; $f(z)$ does tend to the limit $\frac12$ as $z \to (\frac14)^-$, but its
derivative becomes infinite there. Such special points at which smoothness stops are
called singularities, a term that will acquire a precise meaning in the next sections.

FIGURE IV.2. The images of regular grids by $f(z)$ (left) and $g(z)$ (right).

Observe also that, in spite of the series expressions being divergent outside the 
specified intervals, the functions f(z) and g(z) can be continued in certain regions: it 
suffices to make use of the global expressions of Equation (1), with exp and √ being 
assigned their usual real-analytic interpretation. For instance: 

$$f(-1) = \frac{1}{2}\left(1 - \sqrt{5}\right), \qquad g(-2) = \frac{e^2}{3}.$$

Such continuation properties, most notably to the complex realm, will prove essential 
in developing efficient methods for coefficient asymptotics. 


One may proceed similarly with complex numbers, starting with numbers whose 
modulus is less than the radius of convergence of the series defining the GF. Figure 2 
displays the images of regular grids by f and g, as given by (1). This illustrates the fact 
that a regular grid transforms into an orthogonal network of curves and more precisely 
that f and g preserve angles; this property corresponds to complex differentiability 
and is equivalent to analyticity, to be introduced shortly. The singularity of f is clearly 
perceptible on the right of its diagram, since, at $z = \frac14$ (corresponding to $f(z) = \frac12$), 
the function f folds lines and divides angles by a factor of 2. 

Let us now turn to coefficient asymptotics. As is expressed by (2), the coefficients 
$f_n$ and $g_n$ each belong to a general asymptotic type for coefficients of a function F, 
namely, 

$$[z^n]F(z) = A^n \theta(n),$$

corresponding to an exponential growth factor $A^n$ modulated by a tame factor $\theta(n)$, 
which is subexponential. Here, one has A = 4 for $f_n$ and A = 1 for $g_n$; also, 
$\theta(n) \sim \frac14 (\sqrt{\pi n^3})^{-1}$ for $f_n$ and $\theta(n) \sim e^{-1}$ for $g_n$. Clearly, A should be related to the 
radius of convergence of the series. We shall see that invariably, for combinatorial 
generating functions, the exponential rate of growth is given by $A = 1/\rho$, where $\rho$ is the 
first singularity encountered along the positive real axis (Theorem IV.6). In addition, 
under general complex-analytic conditions, it will be established that $\theta(n) = O(1)$ is 
systematically associated to a simple pole of the generating function (Theorem IV.10, 
p. 245), while $\theta(n) = O(n^{-3/2})$ systematically arises from a singularity that is of the 
square-root type (Chapters VI and VII). In summary, as this chapter and the next ones 
will copiously illustrate, the coefficient formula 

$$[z^n]F(z) = A^n \theta(n), \tag{3}$$

with its exponentially dominating term and its subexponential factor, is central. We 
have: 


First Principle of Coefficient Asymptotics. The location of a function’s 
singularities dictates the exponential growth ($A^n$) of its coefficients. 
Second Principle of Coefficient Asymptotics. The nature of the function’s 
singularities determines the associated subexponential factor ($\theta(n)$). 


Observe that the rescaling rule, 

$$[z^n]F(z) = \rho^{-n}[z^n]F(\rho z),$$

enables one to normalize functions so that they are singular at 1. Then various 
theorems, starting with Theorems IV.9 and IV.10, provide sufficient conditions under 
which the following central implication is valid, 

$$h(z) \sim \sigma(z) \quad\Longrightarrow\quad [z^n]h(z) \sim [z^n]\sigma(z). \tag{4}$$

Here, h(z), whose coefficients are to be estimated, is a function singular at 1 and $\sigma(z)$ 
is a local approximation near the singularity; usually $\sigma$ is a much simpler function, 
typically like $(1 - z)^{\alpha}\log^{\beta}(1 - z)$, whose coefficients are comparatively easy to 
estimate (Chapter VI). The relation (4) expresses a mapping between asymptotic scales 
of functions near singularities and asymptotic scales of coefficients. Under suitable 
conditions, it then suffices to estimate a function locally at a few distinguished points 
(singularities), in order to estimate its coefficients asymptotically. 

▷ IV.1. Euler, the discrete, and the continuous. Euler’s proof of the existence of infinitely 
many prime numbers illustrates in a striking manner the way analysis of generating functions 
can inform us on the discrete realm. Define, for real s > 1, the function 

$$\zeta(s) := \sum_{n \ge 1} \frac{1}{n^s},$$

known as the Riemann zeta function. The decomposition (p ranges over the prime numbers 
2, 3, 5, ...) 

$$\zeta(s) = \left(1 + \frac{1}{2^s} + \frac{1}{2^{2s}} + \cdots\right)\left(1 + \frac{1}{3^s} + \frac{1}{3^{2s}} + \cdots\right)\left(1 + \frac{1}{5^s} + \frac{1}{5^{2s}} + \cdots\right)\cdots = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1} \tag{5}$$

expresses precisely the fact that each integer has a unique decomposition as a product of primes. 
Analytically, the identity (5) is easily checked to be valid for all s > 1. Now suppose that there 
were only finitely many primes. Let s tend to $1^+$ in (5). Then, the left-hand side becomes 
infinite, while the right-hand side tends to the finite limit $\prod_p (1 - 1/p)^{-1}$: a contradiction has 
been reached. ◁ 


▷ IV.2. Elementary transfers. Elementary series manipulations yield the following general 
result: Let h(z) be a power series with radius of convergence > 1 and assume that $h(1) \neq 0$; 
then one has 

$$[z^n]\frac{h(z)}{1 - z} \sim h(1), \qquad [z^n]\,h(z)\sqrt{1 - z} \sim -\frac{h(1)}{2\sqrt{\pi n^3}}, \qquad [z^n]\,h(z)\log\frac{1}{1 - z} \sim \frac{h(1)}{n}.$$

See Bender’s survey [29] for many similar statements. ◁ 


▷ IV.3. Asymptotics of generalized derangements. The EGF of permutations without cycles of 
length 1 and 2 satisfies (p. 113) 

$$j(z) = \frac{e^{-z - z^2/2}}{1 - z} \qquad\text{with}\qquad j(z) \underset{z \to 1}{\sim} \frac{e^{-3/2}}{1 - z}.$$

Analogy with derangements suggests that $[z^n]j(z) \sim e^{-3/2}$ as $n \to \infty$. [For a proof, use Note 2 or 
refer to Example 8.] Here is a table of exact values of $[z^n]j(z)$ (with relative error of the 
approximation by $e^{-3/2}$ in parentheses): 

          n = 5         n = 10              n = 20               n = 50 
$j_n$:    0.2           0.22317             0.2231301600         0.2231301601484298289332804707640122 
error:    ($10^{-1}$)   ($2 \cdot 10^{-4}$)   ($3 \cdot 10^{-10}$)   ($10^{-33}$) 

The quality of the asymptotic approximation is extremely good, such a property being invariably 
attached to polar singularities. ◁ 
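
The values $j_n = [z^n]\,e^{-z-z^2/2}/(1-z)$ of this note can be recomputed exactly. The sketch below is one possible way to do it, not a method taken from the text: it obtains the coefficients $a_n$ of $e^{-z-z^2/2}$ from the recurrence $(n+1)a_{n+1} = -a_n - a_{n-1}$ (which follows from differentiating the exponential), then forms partial sums to account for the factor $1/(1-z)$. The function name is illustrative.

```python
from fractions import Fraction
from math import exp


def generalized_derangement_coeffs(n_max: int) -> list:
    """Return [j_0, ..., j_{n_max}] where j_n = [z^n] exp(-z - z^2/2) / (1 - z)."""
    # a_n = [z^n] exp(P(z)) with P(z) = -z - z^2/2.  Since A' = P' A and
    # P'(z) = -1 - z, one has (n + 1) a_{n+1} = -a_n - a_{n-1}.
    a = [Fraction(1)]
    for n in range(n_max):
        previous = a[n - 1] if n >= 1 else Fraction(0)
        a.append((-a[n] - previous) / (n + 1))
    # Multiplying a series by 1/(1 - z) replaces its coefficients by partial sums.
    j, running_sum = [], Fraction(0)
    for coefficient in a:
        running_sum += coefficient
        j.append(running_sum)
    return j


if __name__ == "__main__":
    limit = exp(-1.5)
    values = generalized_derangement_coeffs(50)
    for n in (5, 10, 20):
        relative_error = abs(float(values[n]) - limit) / limit
        print(n, float(values[n]), relative_error)
```

Exact rational arithmetic is used because the relative error at n = 50 is far below double precision; printing `values[50]` as a fraction and converting with a multiprecision library would be needed to see all the digits quoted in the table.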


IV.2. Analytic functions and meromorphic functions 


Analytic functions are a primary mathematical concept of asymptotic theory. They 
can be characterized in two essentially equivalent ways (see Subsection IV.2.1): by means of 
convergent series expansions (à la Cauchy and Weierstraß) and by differentiability 
properties (à la Riemann). The first aspect is directly related to the use of generating 
functions for enumeration; the second one allows for a powerful abstract discussion 
of closure properties that usually requires little computation. 

Integral calculus with analytic functions (see Subsection IV.2.2) assumes a shape radically 
different from what it is in the real domain: integrals become quintessentially 
independent of details of the integration contour; certainly the prime example of this fact 
is Cauchy’s famous residue theorem. Conceptually, this independence makes it 
possible to relate properties of a function at a point (e.g., the coefficients of its expansion 
at 0) to its properties at another far-away point (e.g., its residue at a pole). 

The presentation in this section and the next one constitutes an informal review² 
of basic properties of analytic functions tuned to the needs of asymptotic analysis of 
counting sequences. The entry in APPENDIX B: Equivalent definitions of analyticity, 
p. 687 provides further information, in particular a proof of the Basic Equivalence 
Theorem, Theorem IV.1 below. For a detailed treatment, we refer the reader to one 
of the many excellent treatises on the subject, like the books by Dieudonné [129], 
Henrici [265], Hille [269], Knopp [299], Titchmarsh [469], or Whittaker and 
Watson [492]. 


IV.2.1. Basics. We shall consider functions defined in certain regions of the 
complex domain C. By a region is meant an open subset $\Omega$ of the complex plane 
that is connected. Here are some examples: 

[Figure: four example regions, labelled "simply connected domain", "slit complex plane", "indented disc", and "annulus".] 


Classical treatises teach us how to extend to the complex domain the standard 
functions of real analysis: polynomials are immediately extended as soon as complex 
addition and multiplication have been defined, while the exponential is definable by 
means of Euler’s formula. One has for instance 

$$z^2 = (x^2 - y^2) + 2ixy, \qquad e^z = e^x\cos y + ie^x\sin y,$$

if z = x + iy, that is, $x = \Re(z)$ and $y = \Im(z)$ are the real and imaginary parts of z. 
Both functions are consequently defined over the whole complex plane C. 

The square-root and the logarithm are conveniently described in polar coordinates 
by 

$$\sqrt{z} = \sqrt{\rho}\,e^{i\theta/2}, \qquad \log z = \log\rho + i\theta, \tag{6}$$

if $z = \rho e^{i\theta}$. One can take the domain of validity of (6) to be the complex plane slit 
along the axis from 0 to $-\infty$, that is, restrict $\theta$ to the open interval $(-\pi, +\pi)$, in which 
case the definitions above specify what is known as the principal determination. There 
is no way for instance to extend by continuity the definition of $\sqrt{z}$ in any domain 
containing 0 in its interior since, for a > 0, one has $\sqrt{z} \to i\sqrt{a}$ as 
$z \to -a$ from above, while $\sqrt{z} \to -i\sqrt{a}$ as $z \to -a$ from below. This situation is 
depicted here: 

[Figure: the values of $\sqrt{z}$ as z varies along |z| = a.] 

² The reader previously unfamiliar with the theory of analytic functions should essentially be able to 
adopt Theorems IV.1 and IV.2 as “axioms” and start from there using basic definitions and a fair knowledge 
of elementary calculus. Figure 18 at the end of this chapter (p. 273) recapitulates the main results of 
relevance to Analytic Combinatorics. 


The point z = 0 where several determinations “meet” is accordingly known as a 
branch point. 


Analytic functions. First comes the main notion of an analytic function that 
arises from convergent series expansions and is closely related to the notion of gener- 
ating function encountered in previous chapters. 

Definition IV.1. A function f(z) defined over a region $\Omega$ is analytic at a point $z_0 \in \Omega$ 
if, for z in some open disc centred at $z_0$ and contained in $\Omega$, it is representable by a 
convergent power series expansion 

$$f(z) = \sum_{n \ge 0} c_n (z - z_0)^n. \tag{7}$$

A function is analytic in a region $\Omega$ iff it is analytic at every point of $\Omega$. 

As derived from an elementary property of power series, given a function f that is 
analytic at a point $z_0$, there exists a disc (of possibly infinite radius) with the property 
that the series representing f(z) is convergent for z inside the disc and divergent for z 
outside the disc. The disc is called the disc of convergence and its radius is the radius 
of convergence of f(z) at $z = z_0$, which will be denoted by $R_{\mathrm{conv}}(f; z_0)$. Quite 
elementarily, the radius of convergence of a power series conveys information regarding 
the rate at which its coefficients grow; see Subsection IV.3.2 below for developments. 
It is also easy to prove by simple series rearrangement (see APPENDIX B: Equivalent 
definitions of analyticity, p. 687) that if a function is analytic at $z_0$, it is then analytic 
at all points interior to its disc of convergence. 

Consider for instance the function f(z) = 1/(1 − z) defined over C \ {1} in the 
usual way via complex division. It is analytic at 0 by virtue of the geometric series 
sum, 

$$\frac{1}{1 - z} = \sum_{n \ge 0} 1 \cdot z^n,$$

which converges in the disc |z| < 1. At a point $z_0 \neq 1$, we may write 

$$\frac{1}{1 - z} = \frac{1}{1 - z_0 - (z - z_0)} = \frac{1}{1 - z_0}\,\frac{1}{1 - \frac{z - z_0}{1 - z_0}} = \sum_{n \ge 0} \left(\frac{1}{1 - z_0}\right)^{n+1} (z - z_0)^n. \tag{8}$$

The last equation shows that f(z) is analytic in the disc centred at $z_0$ with radius 
$|1 - z_0|$, that is, the interior of the circle centred at $z_0$ and passing through the point 1. 
In particular $R_{\mathrm{conv}}(f, z_0) = |1 - z_0|$ and f(z) is globally analytic in the punctured 
plane C \ {1}. 
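
The re-expansion of 1/(1 − z) around a point $z_0 \neq 1$, with coefficients $(1/(1 - z_0))^{n+1}$, lends itself to a quick numerical check. The Python sketch below (the function name and the sample points are our own choices) sums the series at a point lying outside the unit disc but inside the disc of radius $|1 - z_0|$ centred at $z_0$, and observes divergence at a point outside that disc.

```python
def geometric_reexpansion(z: complex, z0: complex, terms: int = 200) -> complex:
    """Partial sum of sum_{n>=0} (1/(1 - z0))^(n+1) * (z - z0)^n."""
    c = 1 / (1 - z0)              # common factor 1/(1 - z0)
    ratio = c * (z - z0)          # ratio of consecutive terms
    total, power = 0j, 1 + 0j
    for _ in range(terms):
        total += c * power
        power *= ratio
    return total


if __name__ == "__main__":
    z0 = -1 + 1j                  # expansion centre; |1 - z0| = sqrt(5)
    inside = -2.5 + 1j            # |inside - z0| = 1.5 < sqrt(5), yet |inside| > 1
    outside = 2 + 1j              # |outside - z0| = 3 > sqrt(5)
    print(geometric_reexpansion(inside, z0), 1 / (1 - inside))
    print(abs(geometric_reexpansion(outside, z0)))
```

The first line of output shows agreement with the global expression 1/(1 − z) at a point where the original series at 0 diverges; the second shows partial sums that blow up, in accordance with the radius of convergence $|1 - z_0|$.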
The last example illustrates the definition of analyticity. However, the series 
rearrangement approach that it uses might be difficult to carry out for more complicated 
functions. In other words, a more manageable approach to analyticity is called for. 
The differentiability properties developed next provide such an approach. 

Differentiable (holomorphic) functions. The next important notion is a geomet- 
ric one based on differentiability. 

Definition IV.2. A function f(z) defined over a region $\Omega$ is called complex-differentiable 
(also holomorphic) at $z_0$ if the limit, for complex $\delta$, 

$$\lim_{\delta \to 0} \frac{f(z_0 + \delta) - f(z_0)}{\delta}$$

exists. (In particular, the limit is independent of the way $\delta$ tends to 0 in C.) This limit 
is denoted as usual by $f'(z_0)$ or $\frac{d}{dz}f(z)\big|_{z_0}$. A function is complex-differentiable in $\Omega$ 
iff it is complex-differentiable at every $z_0 \in \Omega$. 

Clearly, if f(z) is complex-differentiable at $z_0$ and $f'(z_0) \neq 0$, it acts locally as a 
linear transformation: 

$$f(z) - f(z_0) \sim f'(z_0)(z - z_0) \qquad (z \to z_0).$$

Then f(z) behaves in small regions almost like a similarity transformation (composed 
of a translation, a rotation, and a scaling). In particular, it preserves angles³ and 
infinitesimal squares get transformed into infinitesimal squares; see Figure 3 for a 
rendering. 
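
The requirement that the limit be independent of the direction in which the increment tends to 0 is what separates complex differentiability from mere smoothness in the real sense. As a hedged illustration (the helper names are ours), the Python sketch below evaluates difference quotients along eight directions for the principal square root, which is holomorphic at 1 + i, and for complex conjugation, which is not holomorphic anywhere.

```python
import cmath


def difference_quotients(f, z0: complex, h: float = 1e-6, directions: int = 8) -> list:
    """Quotients (f(z0 + d) - f(z0)) / d for d = h * exp(i * phi), phi on a regular grid."""
    quotients = []
    for k in range(directions):
        d = h * cmath.exp(2j * cmath.pi * k / directions)
        quotients.append((f(z0 + d) - f(z0)) / d)
    return quotients


def spread(values: list) -> float:
    """Largest distance between any two values in the list."""
    return max(abs(u - v) for u in values for v in values)


if __name__ == "__main__":
    z0 = 1 + 1j
    holomorphic = difference_quotients(cmath.sqrt, z0)
    not_holomorphic = difference_quotients(lambda z: z.conjugate(), z0)
    print(spread(holomorphic), 1 / (2 * cmath.sqrt(z0)))
    print(spread(not_holomorphic))
```

For the square root the quotients cluster around $1/(2\sqrt{z_0})$ whatever the direction; for conjugation the quotient equals $e^{-2i\varphi}$ and sweeps the whole unit circle, so no complex derivative exists.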
For instance the function $\sqrt{z}$, defined by (6) in the complex plane slit along the 
ray $(-\infty, 0)$, is complex-differentiable at any z of the slit plane since 

$$\lim_{\delta \to 0} \frac{\sqrt{z + \delta} - \sqrt{z}}{\delta} = \lim_{\delta \to 0} \sqrt{z}\,\frac{\sqrt{1 + \delta/z} - 1}{\delta} = \frac{1}{2\sqrt{z}}, \tag{9}$$

which extends the customary proof of real analysis. Similarly, $\sqrt{1 - z}$ is analytic in 
the complex plane slit along the ray $(1, +\infty)$. More generally, the usual proofs from 
real analysis carry over almost verbatim to the complex realm, to the effect that 

$$(f + g)' = f' + g', \qquad (fg)' = f'g + fg', \qquad \left(\frac{1}{f}\right)' = -\frac{f'}{f^2}, \qquad (f \circ g)' = (f' \circ g)\,g'.$$

The notion of complex differentiability is thus much more manageable than the notion 
of analyticity. 

It follows from a well known theorem of Riemann (see for instance [265, vol. 1, 
p. 143] and APPENDIX B: Equivalent definitions of analyticity, p. 687) that analyticity 
and complex differentiability are equivalent notions. 



Theorem IV.1 (Basic Equivalence Theorem). A function is analytic in a region $\Omega$ if 
and only if it is complex-differentiable in $\Omega$. 

The following are known facts (see again Appendix B): if a function is analytic 
(equivalently complex-differentiable) in $\Omega$, it admits (complex) derivatives of any 
order there. This property markedly differs from real analysis: complex-differentiable 
(equivalently, analytic) functions are all smooth. Also derivatives of a function are 
obtained through term-by-term differentiation of the series representation of the 
function. 

³ A mapping of the plane that locally preserves angles is also called a conformal map. 

FIGURE IV.3. Multiple views of an analytic function. The image of the domain 
$\Omega = \{z \mid |\Re(z)| < 2, |\Im(z)| < 2\}$ by f(z) = exp(z) + z + 2: [top] transformation of a 
square grid in $\Omega$ by f; [middle] the modulus and argument of f(z); [bottom] the real and 
imaginary parts of f(z). 

Meromorphic functions. We finally introduce meromorphic⁴ functions that are 
mild extensions of the concept of analyticity (or holomorphy) and are essential to the 
theory. 

The quotient of two analytic functions f(z)/g(z) ceases to be analytic at a point 
a where g(a) = 0. However, a simple structure for quotients of analytic functions 
prevails. 


Definition IV.3. A function h(z) is meromorphic at $z_0$ iff, for z in a neighbourhood of 
$z_0$ with $z \neq z_0$, it can be represented as f(z)/g(z), with f(z) and g(z) being analytic 
at $z_0$. In that case, it admits near $z_0$ an expansion of the form 

$$h(z) = \sum_{n \ge -M} h_n (z - z_0)^n. \tag{10}$$

If $h_{-M} \neq 0$ and $M \ge 1$, then h(z) is said to have a pole of order M at $z = z_0$. The 
coefficient $h_{-1}$ is called the residue of h(z) at $z = z_0$ and is written as 

$$\operatorname{Res}[h(z); z = z_0].$$

A function is meromorphic in a region iff it is meromorphic at any point of the region. 

⁴ “Holomorphic” and “meromorphic” are words coming from Greek, meaning respectively “of 
complete form” and “of partial form”. 


IV.2.2. Integrals and residues. A path in a region $\Omega$ is described by its 
parameterization, which is a continuous function $\gamma$ mapping [0, 1] into $\Omega$. Two paths $\gamma$, $\gamma'$ 
in $\Omega$ having the same end points are said to be homotopic (in $\Omega$) if one can be 
continuously deformed into the other while staying within $\Omega$, as in the following examples: 

[Figure: examples of homotopic paths.] 

A closed path⁵ is defined by the fact that its end points coincide: $\gamma(0) = \gamma(1)$, and 
a path is simple if the mapping $\gamma$ is one-to-one. A closed path is said to be a loop of 
$\Omega$ if it can be continuously deformed within $\Omega$ to a single point; in this case one also 
says that the path is homotopic to 0. In what follows we implicitly restrict attention to 
paths that are assumed to be rectifiable. Unless otherwise stated, all integration paths 
will be assumed to be oriented positively. 

Integrals along curves in the complex plane are defined in the usual way as 
curvilinear integrals of complex-valued functions. Explicitly: let f(x + iy) be a function 
and $\gamma$ be a path; then, 

$$\int_\gamma f(z)\,dz := \int_0^1 f(\gamma(t))\,\gamma'(t)\,dt = \int_0^1 [AC - BD]\,dt + i\int_0^1 [AD + BC]\,dt,$$

where f = A + iB and $\gamma'$ = C + iD. However, integral calculus in the complex plane 
is of a radically different nature from what it is on the real line: in a way it is much 
simpler and much more powerful. One has: 


Theorem IV.2 (Null Integral Property). Let f be analytic in $\Omega$ and let $\lambda$ be a simple 
loop of $\Omega$. Then $\int_\lambda f = 0$. 


⁵ By default, paths used in this book are assumed to be positively oriented piecewise continuously 
differentiable (hence rectifiable); in addition, closed paths are assumed to be positively oriented. 

Equivalently, integrals are largely independent of details of contours: for f analytic 
in $\Omega$, one has 

$$\int_\gamma f = \int_{\gamma'} f, \tag{11}$$

provided $\gamma$ and $\gamma'$ are homotopic (not necessarily closed) paths in $\Omega$. A proof of 
Theorem IV.2 is sketched in APPENDIX B: Equivalent definitions of analyticity, p. 687. 
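
The Null Integral Property can be observed numerically. The following Python sketch, which is only an illustration under our own choice of contour and quadrature rule, approximates integrals along a positively oriented circle by the trapezoidal rule in the angle: the integral of the entire function exp vanishes, whereas that of 1/z, which fails to be analytic at the centre of the circle, does not.

```python
import cmath


def circle_integral(f, center: complex = 0j, radius: float = 1.0, steps: int = 2000) -> complex:
    """Approximate the integral of f along the positively oriented circle
    |z - center| = radius, using the trapezoidal rule in the angle."""
    total = 0j
    d_theta = 2 * cmath.pi / steps
    for k in range(steps):
        w = cmath.exp(1j * k * d_theta)
        z = center + radius * w
        dz = 1j * radius * w * d_theta        # gamma'(theta) d(theta)
        total += f(z) * dz
    return total


if __name__ == "__main__":
    print(abs(circle_integral(cmath.exp)))        # analytic inside: close to 0
    print(circle_integral(lambda z: 1 / z))       # pole inside: close to 2*i*pi
```

The second value, $2i\pi$, is exactly the quantity that the residue theorem accounts for.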


Residues. The important Residue Theorem due to Cauchy relates global prop- 
erties of a meromorphic function (its integral along closed curves) to purely local 
characteristics at designated points (the residues at poles). 


Theorem IV.3 (Cauchy’s residue theorem). Let h(z) be meromorphic in the region $\Omega$ 
and let $\lambda$ be a simple loop in $\Omega$ along which the function is analytic. Then 

$$\frac{1}{2i\pi}\int_\lambda h(z)\,dz = \sum_s \operatorname{Res}[h(z); z = s],$$

where the sum is extended to all poles s of h(z) enclosed by $\lambda$. 


PROOF. (Sketch) To see it in the representative case where h(z) has only a pole at 
z = 0, observe by appealing to primitive functions that 

$$\int_\lambda h(z)\,dz = \sum_{\substack{n \ge -M \\ n \neq -1}} h_n \left[\frac{z^{n+1}}{n+1}\right]_\lambda + h_{-1}\int_\lambda \frac{dz}{z},$$

where the bracket notation $[u(z)]_\lambda$ designates the variation of the function u(z) along 
the contour $\lambda$. This expression reduces to its last term, itself equal to $2i\pi h_{-1}$, as is 
checked by using integration along a circle (set $z = re^{i\theta}$). The computation extends 
by translation to the case of a unique pole at z = a. 

In the case of multiple poles, we observe that the simple loop can only enclose 
finitely many poles (by compactness). The proof then follows from a simple 
decomposition of the interior domain of $\lambda$ into cells each containing only one pole. Here is 
an illustration in the case of three poles. 

[Figure: a loop enclosing three poles, with its interior decomposed into three cells.] 

(Contributions from internal edges cancel.) 


Global (integral) to local (residues) connections. Here is a textbook example of 
a reduction from global to local properties of analytic functions. Define the integrals 

$$I_m := \int_{-\infty}^{\infty} \frac{dx}{1 + x^{2m}},$$

and consider specifically $I_1$. Elementary calculus teaches us that $I_1 = \pi$ since the 
antiderivative of the integrand is an arc tangent: 

$$I_1 = \int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = [\arctan x]_{-\infty}^{+\infty} = \pi.$$


Here is an alternative, and in many ways more fruitful, derivation. In the light 
of the residue theorem, we consider the integral over the whole line as the limit of 
integrals over large intervals of the form [−R, +R], then complete the contour of 
integration by means of a large semi-circle in the upper half-plane, as shown below: 

[Figure: the segment from −R to +R on the real axis, closed by a semi-circle in the upper half-plane.] 

Let $\gamma$ be the contour composed of the interval and the semi-circle. Inside $\gamma$, the 
integrand has a pole at x = i, where 

$$\frac{1}{1 + x^2} \equiv \frac{1}{(x + i)(x - i)} = -\frac{i}{2}\,\frac{1}{x - i} + \cdots,$$

so that its residue there is −i/2. By the residue theorem, the integral taken over $\gamma$ is 
equal to $2i\pi$ times the residue of the integrand at i. As $R \to \infty$, the integral along 
the semi-circle vanishes (it is less than $\pi R/(1 + R^2)$ in modulus), while the integral 
along the real segment gives $I_1$ in the limit. There results the relation giving $I_1$: 

$$I_1 = 2i\pi \operatorname{Res}\left(\frac{1}{1 + x^2}; x = i\right) = (2i\pi)\left(-\frac{i}{2}\right) = \pi.$$

The evaluation of the integral in the framework of complex analysis rests solely 
upon the local expansion of the integrand at special points (here, the point i). This is a 
remarkable feature of the theory, one that confers it much simplicity, when compared 
to real analysis. 


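▷ IV.4. The general integral $I_m$. Let $\alpha = \exp(\frac{i\pi}{2m})$ so that $\alpha^{2m} = -1$. Contour integration 
of the type used for $I_1$ yields 

$$I_m = 2i\pi \sum_{j=1}^{m} \operatorname{Res}\left(\frac{1}{1 + x^{2m}}; x = \alpha^{2j-1}\right),$$

while, for any $\beta = \alpha^{2j-1}$ with $1 \le j \le m$, one has 

$$\frac{1}{1 + x^{2m}} \underset{x \to \beta}{\sim} \frac{1}{2m\beta^{2m-1}}\,\frac{1}{x - \beta} \equiv -\frac{\beta}{2m}\,\frac{1}{x - \beta}.$$

As a consequence, 

$$I_m = -\frac{i\pi}{m}\left(\alpha + \alpha^3 + \cdots + \alpha^{2m-1}\right) = \frac{\pi}{m\sin\frac{\pi}{2m}}.$$

In particular, $I_2 = \pi/\sqrt{2}$, $I_3 = 2\pi/3$, $I_4 = \frac{\pi}{4}\sqrt{2}\sqrt{2 + \sqrt{2}}$, and $\frac{1}{\pi}I_5$, $\frac{1}{\pi}I_6$ are expressible by 
radicals, but $\frac{1}{\pi}I_7$, $\frac{1}{\pi}I_9$ are not. The special cases $\frac{1}{\pi}I_{17}$, $\frac{1}{\pi}I_{257}$ are expressible by radicals. ◁ 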
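
The evaluation of $I_m = \int_{-\infty}^{\infty} dx/(1 + x^{2m})$ by residues can be cross-checked in three ways. The Python sketch below (function names are ours) sums the residues $-\beta/(2m)$ over the m poles $\beta$ of the upper half-plane, evaluates the closed form $\pi/(m \sin\frac{\pi}{2m})$, and, as an independent control that is not part of the text, integrates numerically after the substitution x = tan t, which turns the integrand into $\cos^{2m-2}t/(\cos^{2m}t + \sin^{2m}t)$ on $(-\pi/2, \pi/2)$.

```python
import cmath
from math import cos, pi, sin


def integral_by_residues(m: int) -> complex:
    """2*i*pi times the sum of residues of 1/(1 + x^(2m)) in the upper half-plane."""
    alpha = cmath.exp(1j * pi / (2 * m))
    residue_sum = 0j
    for j in range(1, m + 1):
        beta = alpha ** (2 * j - 1)       # simple pole, beta^(2m) = -1
        residue_sum += -beta / (2 * m)    # residue 1/(2m beta^(2m-1)) = -beta/(2m)
    return 2j * pi * residue_sum


def integral_closed_form(m: int) -> float:
    """The closed form pi / (m sin(pi / (2m)))."""
    return pi / (m * sin(pi / (2 * m)))


def integral_numeric(m: int, steps: int = 4000) -> float:
    """Midpoint rule for the integral after the substitution x = tan(t)."""
    h = pi / steps
    total = 0.0
    for k in range(steps):
        t = -pi / 2 + (k + 0.5) * h
        c, s = cos(t), sin(t)
        total += c ** (2 * m - 2) / (c ** (2 * m) + s ** (2 * m))
    return total * h


if __name__ == "__main__":
    for m in range(1, 7):
        print(m, integral_by_residues(m), integral_closed_form(m), integral_numeric(m))
```

The residue sum comes out with a negligible imaginary part, as it must for a real integral.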
▷ IV.5. Integrals of rational fractions. Generally, all integrals of rational functions taken over 
the whole real line are computable by residues. In particular, 

$$J_m = \int_{-\infty}^{+\infty} \frac{dx}{(1 + x^2)^m}, \qquad K_m = \int_{-\infty}^{+\infty} \frac{dx}{(1^2 + x^2)(2^2 + x^2)\cdots(m^2 + x^2)}$$

can be explicitly evaluated. ◁ 

Cauchy’s coefficient formula. Many function-theoretic consequences derive from 
the residue theorem. For instance, if f is analytic in $\Omega$, $z_0 \in \Omega$, and $\lambda$ is a simple loop 
of $\Omega$ encircling $z_0$, one has 

$$f(z_0) = \frac{1}{2i\pi}\int_\lambda f(\zeta)\,\frac{d\zeta}{\zeta - z_0}. \tag{12}$$

This follows directly since 

$$\operatorname{Res}[f(\zeta)/(\zeta - z_0); \zeta = z_0] = f(z_0).$$

Then, by differentiation with respect to $z_0$ under the integral sign, one gets similarly 

$$\frac{1}{k!}f^{(k)}(z_0) = \frac{1}{2i\pi}\int_\lambda f(\zeta)\,\frac{d\zeta}{(\zeta - z_0)^{k+1}}. \tag{13}$$

The values of a function and its derivatives at a point can thus be obtained as values of 
integrals of the function away from that point. The world of analytic functions is a very 
gentle one in which to live: contrary to real analysis, a function is differentiable any 
number of times as soon as it is differentiable once. Also, Taylor’s formula invariably 
holds: as soon as f(z) is analytic at $z_0$, one has 

$$f(z) = f(z_0) + f'(z_0)(z - z_0) + \frac{1}{2!}f''(z_0)(z - z_0)^2 + \cdots, \tag{14}$$

with the representation being convergent in a small disc centred at $z_0$. [Proof: a 
verification from (12) and (13), or a series rearrangement as in (B.7), p. 688.] 


A very important application of the residue theorem concerns coefficients of 
analytic functions. 

Theorem IV.4 (Cauchy’s Coefficient Formula). Let f(z) be analytic in a region 
containing 0 and let $\lambda$ be a simple loop around 0 that is positively oriented. Then the 
coefficient $[z^n]f(z)$ admits the integral representation 

$$f_n \equiv [z^n]f(z) = \frac{1}{2i\pi}\int_\lambda f(z)\,\frac{dz}{z^{n+1}}.$$

PROOF. This formula follows directly from the equalities 

$$\frac{1}{2i\pi}\int_\lambda f(z)\,\frac{dz}{z^{n+1}} = \operatorname{Res}[f(z)z^{-n-1}; z = 0] = [z^n]f(z),$$

of which the first follows from the residue theorem, and the second from the 
identification of the residue at 0 as a coefficient. 

Analytically, the coefficient formula allows one to deduce information about the 
coefficients from the values of the function itself, using adequately chosen contours of 
integration. It thus opens the possibility of estimating the coefficients $[z^n]f(z)$ in the 
expansion of f(z) near 0 by using information on f(z) away from 0. The rest of this 
chapter will precisely illustrate this process in the case of rational and meromorphic 
functions. Observe also that the residue theorem provides the simplest known proof 
of the Lagrange inversion theorem (see APPENDIX A: Lagrange Inversion, p. 677) 
whose rôle is central to tree enumerations, as we saw in Chapters I and II. The notes 
below explore some independent consequences of the residue theorem and the 
coefficient formula. 
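
Cauchy’s coefficient formula is also a practical numerical device. Taking for the contour a circle |z| = r inside the disc of analyticity and setting $z = re^{i\theta}$, the integral becomes the average of $f(z)z^{-n}$ over the circle, which the trapezoidal rule approximates very accurately. The Python sketch below is an illustration under these assumptions (radius 1/8, 512 sample points, names of our choosing), applied to the Catalan OGF $f(z) = \frac12(1 - \sqrt{1 - 4z})$.

```python
import cmath


def cauchy_coefficient(f, n: int, radius: float, steps: int = 512) -> complex:
    """Approximate [z^n] f(z) = (1/(2 i pi)) * integral of f(z) dz / z^(n+1) over
    the circle |z| = radius.  With dz = i z d(theta), the integral is the mean
    value of f(z) / z^n over the circle, estimated here by the trapezoidal rule."""
    total = 0j
    for k in range(steps):
        z = radius * cmath.exp(2j * cmath.pi * k / steps)
        total += f(z) / z ** n
    return total / steps


def catalan_ogf(z: complex) -> complex:
    """f(z) = (1 - sqrt(1 - 4z)) / 2, principal square root, analytic for |z| < 1/4."""
    return (1 - cmath.sqrt(1 - 4 * z)) / 2


if __name__ == "__main__":
    for n in range(0, 11):
        print(n, cauchy_coefficient(catalan_ogf, n, radius=0.125).real)
```

Only values of f on the circle are used, never its series expansion at 0; this is the sense in which the formula relates coefficients to the behaviour of the function away from the origin.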

▷ IV.6. Liouville’s Theorem. If a function f(z) is analytic in the whole of C and is of modulus 
bounded by an absolute constant, $|f(z)| \le B$, then it must be a constant. [By trivial bounds, 
upon integrating on a large circle, it is found that the Taylor coefficients at the origin of index 
$\ge 1$ are all equal to 0.] Similarly, if f(z) is of at most polynomial growth, $|f(z)| \le B(|z| + 1)^r$ 
over the whole of C, then it must be a polynomial. ◁ 


▷ IV.7. Lindelöf integrals. Let a(s) be analytic in $\Re(s) > \frac14$ where it is assumed to satisfy 
$a(s) = O(\exp((\pi - \delta)|s|))$ for some $\delta$ with $0 < \delta < \pi$. Then, one has for $|\arg(z)| < \delta$, 

$$\sum_{k=1}^{\infty} a(k)(-z)^k = -\frac{1}{2i\pi}\int_{1/2 - i\infty}^{1/2 + i\infty} a(s)\,z^s\,\frac{\pi}{\sin \pi s}\,ds,$$

in the sense that the integral exists and provides the analytic continuation of the sum in 
$|\arg(z)| < \delta$. [Close the integration contour by a large semi-circle on the right and evaluate by residues.] 
Such integrals, sometimes called Lindelöf integrals, provide representations for many functions 
whose Taylor coefficients are given by an explicit rule [220, 333]. ◁ 


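▷ IV.8. Continuation of polylogarithms. As a consequence of Lindelöf’s representation, the 
generalized polylogarithm functions, 

$$\operatorname{Li}_{\alpha,k}(z) = \sum_{n \ge 1} n^{-\alpha}(\log n)^k z^n \qquad (\alpha \in \mathbb{R},\ k \in \mathbb{Z}_{\ge 0}),$$

are analytic in the complex plane C slit along $(1, +\infty)$. (More properties are presented in 
Section VI.8; see also [176, 220].) For instance, one obtains in this way 

$$\text{“}\sum_{n=1}^{\infty} (-1)^n \log n\text{”} = -\frac{1}{4}\int_{-\infty}^{+\infty} \frac{\log(\frac14 + t^2)}{\cosh(\pi t)}\,dt = 0.22579\cdots = \log\sqrt{\frac{\pi}{2}},$$

when the divergent series on the left is interpreted as $\operatorname{Li}_{0,1}(-1) = \lim_{z \to -1^+} \operatorname{Li}_{0,1}(z)$. ◁ 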


▷ IV.9. Magic duality. Let $\phi$ be a function initially defined over the nonnegative integers but 
admitting a meromorphic extension over the whole of C. Under growth conditions in the style 
of Note 7, the function 

$$F(z) := \sum_{n \ge 1} \phi(n)(-z)^n,$$

which is analytic at the origin, is such that, near positive infinity, 

$$F(z) \underset{z \to +\infty}{\sim} E(z) - \sum_{n \ge 1} \phi(-n)(-z)^{-n},$$

for some elementary function E(z). [Starting from the representation of Note 7, close the 
contour of integration by a large semicircle to the left.] In such cases, the function is said to 
satisfy the principle of magic duality: its expansions at 0 and $\infty$ are given by one and the same 
rule. Functions 

$$\frac{1}{1 + z}, \quad \log(1 + z), \quad \exp(-z), \quad \operatorname{Li}_2(-z), \quad \operatorname{Li}_3(-z),$$

satisfy a form of magic duality. Ramanujan [42] made great use of this principle, which 
applies to a wide class of functions including hypergeometric ones; see Hardy’s insightful 
discussion [260, Ch XI]. ◁ 
▷ IV.10. Euler–Maclaurin and Abel–Plana summations. Under simple conditions on the 
analytic function f, one has Plana’s (also known as Abel’s) complex variables version of the 
Euler–Maclaurin summation formula: 

$$\sum_{n=0}^{\infty} f(n) = \frac{1}{2}f(0) + \int_0^{\infty} f(x)\,dx + i\int_0^{\infty} \frac{f(iy) - f(-iy)}{e^{2\pi y} - 1}\,dy.$$

(See [266, p. 274] for a proof and validity conditions.) ◁ 


▷ IV.11. Nörlund–Rice integrals. Let a(z) be analytic for $\Re(z) > k_0 - \frac12$ and of at most 
polynomial growth in this right half-plane. Then, with $\gamma$ a simple loop around the interval 
$[k_0, n]$, one has 

$$\sum_{k=k_0}^{n} \binom{n}{k}(-1)^{n-k}a(k) = \frac{1}{2i\pi}\int_\gamma a(s)\,\frac{n!\,ds}{s(s-1)(s-2)\cdots(s-n)}.$$

If a(z) is meromorphic in a larger region, then the integral can be estimated by residues. For 
instance, with 

$$S_n = \sum_{k=1}^{n}\binom{n}{k}\frac{(-1)^k}{k}, \qquad T_n = \sum_{k=1}^{n}\binom{n}{k}\frac{(-1)^k}{k^2 + 1},$$

it is found that $S_n = -H_n$ (a harmonic number), while $T_n$ oscillates boundedly as $n \to +\infty$. 
[This technique is a classical one in the calculus of finite differences, going back to 
Nörlund [374]. In computer science it is known as the method of Rice’s integrals [207] and 
is used in the analysis of many algorithms and data structures including digital trees and radix 
sort [307, 458].] ◁ 
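
The two alternating binomial sums of this note, $S_n = \sum_{k=1}^{n}\binom{n}{k}(-1)^k/k$ and $T_n = \sum_{k=1}^{n}\binom{n}{k}(-1)^k/(k^2+1)$, involve massive cancellations between large terms, so a floating-point evaluation quickly loses all accuracy. The Python sketch below (names are illustrative) therefore uses exact rational arithmetic to confirm that $S_n = -H_n$ and to display the bounded behaviour of $T_n$.

```python
from fractions import Fraction
from math import comb


def alternating_binomial_sum(n: int, a, k0: int = 1) -> Fraction:
    """Exact value of sum_{k=k0}^{n} binom(n, k) * (-1)^k * a(k)."""
    return sum((comb(n, k) * (-1) ** k * a(k) for k in range(k0, n + 1)), Fraction(0))


def harmonic(n: int) -> Fraction:
    """The harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20, 40):
        s_n = alternating_binomial_sum(n, lambda k: Fraction(1, k))
        t_n = alternating_binomial_sum(n, lambda k: Fraction(1, k * k + 1))
        print(n, s_n == -harmonic(n), float(t_n))
```

The printed values of $T_n$ stay within a fixed interval while drifting slowly with n; the residue computation attributes this to the poles of $1/(s^2+1)$ at $s = \pm i$, which contribute terms oscillating in log n.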


IV.3. Singularities and exponential growth of coefficients 


For a given function, a singularity can be informally defined as a point where the 
function ceases to be analytic. (Poles are the very simplest type of singularity.) Singu- 
larities are, as we have stressed repeatedly, essential to coefficient asymptotics. This 
section presents the bases of a discussion within the framework of analytic function 
theory. 


IV.3.1. Singularities. Let f(z) be an analytic function defined over the interior 
region $\Omega$ determined by a simple closed curve $\gamma$, and let $z_0$ be a point of the bounding 
curve $\gamma$. If there exists an analytic function $f^*(z)$ defined over some open set $\Omega^*$ 
containing $z_0$ and such that $f^*(z) = f(z)$ in $\Omega^* \cap \Omega$, one says that f is analytically 
continuable at $z_0$ and that $f^*$ is an immediate analytic continuation of f. 

[Figure: analytic continuation, with $f^*(z) = f(z)$ on $\Omega^* \cap \Omega$; f is defined on $\Omega$, bounded by $\gamma$, and $f^*$ on $\Omega^*$.] 


Consider for instance the quasi-inverse function, f(z) = 1/(1 − z). Its power 
series representation $f(z) = \sum_{n \ge 0} z^n$ initially converges in |z| < 1. However, the 
calculation of (8) shows that it is representable locally by a convergent series near 
any point $z_0 \neq 1$. In particular, it is continuable at any point of the unit circle 
except 1. (Alternatively, one may appeal to complex-differentiability to verify directly 
that f(z), which is given by a “global” expression, is holomorphic, hence analytic, in 
the punctured plane C \ {1}.) 

In sharp contrast to real analysis where a function admits of many smooth 
extensions, analytic continuation is essentially unique: if $f^*$ (in $\Omega^*$) and $f^{**}$ (in $\Omega^{**}$) 
continue f at $z_0$, then one must have $f^*(z) = f^{**}(z)$ in the intersection $\Omega^* \cap \Omega^{**}$, 
which in particular includes a small disc around $z_0$. Thus, the notion of immediate 
analytic continuation at a boundary point is intrinsic. The process can be iterated and 
we say that g is an analytic continuation⁶ of f along a path $\gamma$, even if the domains of 
definition of f and g do not overlap, provided a finite chain of intermediate functions 
connects f and g. This notion is once more intrinsic; this is known as the principle of 
unicity of analytic continuation (Rudin [419, Ch. 16] provides a thorough discussion). 
An analytic function is then much like a hologram: as soon as it is specified in any 
tiny region, it is rigidly determined in any wider region where it can be continued. 


Definition IV.4. Given a function f defined in the region interior to the simple closed 
curve $\gamma$, a point $z_0$ on the boundary ($\gamma$) of the region is a singular point or a 
singularity⁷ if f is not analytically continuable at $z_0$. 

Granted the intrinsic character of analytic continuation, we can usually dispense with 
a detailed description of the original domain $\Omega$ and the curve $\gamma$. In simple terms, a 
function is singular at $z_0$ if it cannot be continued as an analytic function beyond $z_0$. 
A point at which a function is analytic is also called by contrast a regular point. 

The two functions f(z) = 1/(1 − z) and $g(z) = \sqrt{1 - z}$ may be taken as initially 
defined over the open unit disc by their power series representation. Then, as we 
already know, they can be analytically continued to larger regions, the punctured plane 
$\Omega$ = C \ {1} for f [e.g., by the calculation of (8)] and the complex plane slit along 
$(1, +\infty)$ for g [e.g., by virtue of differentiability as in (9)]. But both are singular at 1: 
for f, this results from the fact that (say) $f(z) \to \infty$ as $z \to 1$; for g this is due to the 
branching character of the square-root. Figure 4 displays a few types of singularities 
that are traceable by the way they deform a regular grid near a boundary point. 


It is easy to check from the definitions that a converging power series is analytic 
inside its disc of convergence. In other words, it can have no singularity inside this 
disc. However, it must have at least one singularity on the boundary of the disc, as 
asserted by the theorem below. In addition, a classical theorem, called Pringsheim’s 
theorem, provides a refinement of this property in the case of functions with nonneg- 
ative coefficients, which includes all combinatorial generating functions. 


Theorem IV.5 (Boundary singularities). A function $f(z)$ analytic at the origin,
whose expansion at the origin has a finite radius of convergence $R$, necessarily
has a singularity on the boundary of its disc of convergence, $|z| = R$.

PROOF. Consider the expansion

(15)    $f(z) = \sum_{n \ge 0} f_n z^n,$

assumed to have radius of convergence exactly $R$. We already know that there can
be no singularity of $f$ within the disc $|z| < R$. To prove that there is a
singularity on $|z| = R$, suppose a contrario that $f(z)$ is analytic in the disc
$|z| < \rho$ for some $\rho$ satisfying $\rho > R$. By Cauchy’s coefficient formula
(Theorem IV.4), upon integrating along the circle of radius $r = (R + \rho)/2$,
and by trivial bounds, it is seen that the coefficient $[z^n] f(z)$ is
$O(r^{-n})$. But then, the series expansion of $f$ would have to converge in the
disc of radius $r > R$, a contradiction.

⁶ The collection of all function elements continuing a given function gives rise to
the notion of Riemann surface, for which many good books exist, e.g., [157, 444].
We shall normally avoid appealing to this theory.
⁷ For a detailed discussion, see [129, p. 229], [299, vol. 1, p. 82], or [469].

FIGURE IV.4. The images of a grid on the unit square (with corners $\pm 1 \pm i$)
by various functions singular at $z = 1$ reflect the nature of the singularities
involved. Singularities are apparent near the right of each diagram where small
grid squares get folded or unfolded in various ways. (In the case of functions
$f_0$, $f_1$, $f_4$ that become infinite at $z = 1$, the grid has been slightly
truncated to the right.)

Pringsheim’s Theorem stated and proved now is a refinement of Theorem IV.5 that 
applies to all series having nonnegative coefficients, in particular, generating func- 
tions. It is central to asymptotic enumeration as the remainder of this section will 
amply demonstrate. 


Theorem IV.6 (Pringsheim’s Theorem). If $f(z)$ is representable at the origin by a
series expansion that has nonnegative coefficients and radius of convergence $R$,
then the point $z = R$ is a singularity of $f(z)$.
▷ IV.12. Proof of Pringsheim’s Theorem. (See also [469, Sec. 7.21].) In a nutshell,
the idea of the proof is that if $f$ has positive coefficients and is analytic at
$R$, then its expansion slightly to the left of $R$ has positive coefficients. Then
the power series of $f$ would converge in a disc larger than the postulated disc of
convergence, a clear contradiction.

Suppose a contrario that $f(z)$ is analytic at $R$, implying that it is analytic in
a disc of radius $r$ centred at $R$. We choose a number $h$ such that
$0 < h < \frac{1}{3} r$ and consider the expansion of $f(z)$ around $z_0 = R - h$:

(16)    $f(z) = \sum_{m \ge 0} g_m (z - z_0)^m.$

By Taylor’s formula and the representability of $f(z)$ together with its
derivatives at $z_0$ by means of (15), we have

    $g_m = \sum_{n \ge 0} \binom{n}{m} f_n z_0^{n-m},$

and in particular, $g_m \ge 0$.
Given the way $h$ was chosen, the series (16) converges at $z = R + h$ (so that
$z - z_0 = 2h$) as illustrated by the following diagram:

[Diagram not reproduced: the points $R - h$, $R$, and $R + h$ on the real axis,
with the disc of radius $2h$ centred at $z_0 = R - h$ lying inside the disc of
radius $r$ centred at $R$.]

Consequently, one has

    $f(R + h) = \sum_{m \ge 0} \Bigl( \sum_{n \ge 0} \binom{n}{m} f_n z_0^{n-m} \Bigr) (2h)^m.$

This is a converging double sum of positive terms, so that the sum can be
reorganized in any way we like. In particular, one has convergence of all the
series involved in

    $f(R + h) = \sum_{m, n \ge 0} \binom{n}{m} f_n (R - h)^{n-m} (2h)^m$
    $\phantom{f(R + h)} = \sum_{n \ge 0} f_n \bigl[ (R - h) + (2h) \bigr]^n$
    $\phantom{f(R + h)} = \sum_{n \ge 0} f_n (R + h)^n.$

This establishes the fact that $f_n = o((R + h)^{-n})$, thereby reaching a
contradiction with the assumption that the series representation of $f$ has radius
of convergence exactly $R$. Pringsheim’s theorem is proved. ◁

Singularities of a function analytic at 0 which lie on the boundary of the disc of
convergence are called dominant singularities. Pringsheim’s theorem appreciably
simplifies the search for dominant singularities of combinatorial generating
functions since these have nonnegative coefficients: it is then sufficient to
investigate analyticity along the positive real line and detect the first place at
which it ceases to hold.

For instance, the derangement EGF and the surjection EGF,

    $D(z) = \frac{e^{-z}}{1 - z}, \qquad R(z) = (2 - e^z)^{-1},$

are analytic except for a simple pole at $z = 1$ in the case of $D(z)$, and except
for points $\chi_k = \log 2 + 2ik\pi$ that are simple poles in the case of $R(z)$.
Thus the dominant singularities for derangements and surjections are at 1 and
$\log 2$ respectively.

It is known that $\sqrt{Z}$ cannot be unambiguously defined as an analytic function
in a neighbourhood of $Z = 0$. As a consequence, the function

    $C(z) = (1 - \sqrt{1 - 4z})/2,$

which is the generating function of the Catalan numbers, is an analytic function in
regions that must exclude 1/4; for instance, one may opt to take the complex plane
slit along the ray $(1/4, +\infty)$. Similarly, the function

    $L(z) = \log \frac{1}{1 - z},$

which is the EGF of cyclic permutations, is analytic in the complex plane slit
along $(1, +\infty)$.

A function having no singularity at a finite distance is called entire; its Taylor
series then converges everywhere in the complex plane. The EGFs,

    $e^{z + z^2/2}$   and   $e^{e^z - 1},$

associated respectively with involutions and set partitions, are entire.


IV. 3.2. The Exponential Growth Formula. We say that a number sequence $\{a_n\}$
is of exponential order $K^n$, which we abbreviate as (the symbol $\bowtie$ is a
“bowtie”)

    $a_n \bowtie K^n$   iff   $\limsup |a_n|^{1/n} = K.$

The relation $X \bowtie Y$ reads as “X is of exponential order Y”. It expresses
both an upper bound and a lower bound, and one has, for any $\epsilon > 0$:
(i) $|a_n| >_{\text{i.o.}} (K - \epsilon)^n$, that is to say, $|a_n|$ exceeds
$(K - \epsilon)^n$ infinitely often (for infinitely many values of $n$);
(ii) $|a_n| <_{\text{a.e.}} (K + \epsilon)^n$, that is to say, $|a_n|$ is
dominated by $(K + \epsilon)^n$ almost everywhere (except for possibly finitely
many values of $n$).
This relation can be rephrased as $a_n = K^n \theta(n)$, where $\theta$ is a
subexponential factor satisfying

    $\limsup |\theta(n)|^{1/n} = 1;$

such a factor is thus bounded from above almost everywhere by any increasing
exponential (of the form $(1 + \epsilon)^n$) and bounded from below infinitely
often by any decaying exponential (of the form $(1 - \epsilon)^n$). Typical
subexponential factors are

    $1, \quad n^3, \quad (\log n)^2, \quad \sqrt{n}, \quad \frac{1}{\sqrt[3]{\log n}}, \quad n^{-3/2}, \quad \log\log n.$

(Functions like $e^{\sqrt{n}}$ and $\exp(\log^2 n)$ are to be treated as
subexponential factors for the purpose of this discussion.) The lim sup definition
also allows in principle for factors that are infinitely often very small or 0,
like $n^2 \sin(n\pi/2)$, $\log n \cos\sqrt{n}$, and so on. In this and the next
chapters, we shall develop systematic methods that enable one to extract such
subexponential factors from generating functions.

It is an elementary observation that the radius of convergence of the series
representation of $f(z)$ at 0 is related to the exponential growth rate of the
coefficients $f_n = [z^n] f(z)$. To wit, if $R_{\text{conv}}(f; 0) = R$, then we
claim that

(17)    $f_n \bowtie \left( \frac{1}{R} \right)^n$,   i.e.,   $f_n = R^{-n} \theta(n)$   with   $\limsup |\theta(n)|^{1/n} = 1.$

▷ IV.13. Radius of convergence and exponential growth. This only requires the basic
definition of a power series. (i) By definition of the radius of convergence, we
have for any small $\epsilon > 0$, $f_n (R - \epsilon)^n \to 0$. In particular,
$|f_n| (R - \epsilon)^n < 1$ for all sufficiently large $n$, so that
$|f_n|^{1/n} < (R - \epsilon)^{-1}$ “almost everywhere”. (ii) In the other
direction, for any $\epsilon > 0$, $|f_n| (R + \epsilon)^n$ cannot be a bounded
sequence, since otherwise, $\sum_n |f_n| (R + \epsilon/2)^n$ would be a convergent
series. Thus, $|f_n|^{1/n} > (R + \epsilon)^{-1}$ “infinitely often”. ◁
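
As a numerical illustration of (17), the following minimal sketch estimates
$|f_n|^{1/n}$ for the Catalan numbers, whose generating function
$C(z) = (1 - \sqrt{1 - 4z})/2$ has radius of convergence $R = 1/4$, so that the
estimates should approach $1/R = 4$ from below. The function names `catalan` and
`nth_root_estimate` are illustrative choices, not notation from the text; the
closed form $\binom{2n}{n}/(n+1)$ for Catalan numbers is a standard fact assumed
here.

```python
import math


def catalan(n: int) -> int:
    """Exact n-th Catalan number, binomial(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)


def nth_root_estimate(n: int) -> float:
    """Return f_n^(1/n) for f_n = catalan(n), computed through logarithms.

    math.log accepts arbitrarily large Python integers, so no overflow occurs.
    """
    return math.exp(math.log(catalan(n)) / n)


if __name__ == "__main__":
    # The estimates creep up towards 1/R = 4; the gap reflects the
    # subexponential factor theta(n), here of order n^(-3/2).
    for n in (10, 100, 1000, 10000):
        print(n, round(nth_root_estimate(n), 5))
```

The slow convergence is expected: the relation $\bowtie$ only captures the
exponential rate and is blind to the subexponential factor.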


A global approach to the determination of growth rates is desirable. This is made 
possible by Theorem IV.5. 


Theorem IV.7 (Exponential Growth Formula). If $f(z)$ is analytic at 0 and $R$ is
the modulus of a singularity nearest to the origin in the sense that⁸

    $R := \sup \{\, r \ge 0 \mid f \text{ is analytic in } |z| < r \,\},$

then the coefficient $f_n = [z^n] f(z)$ satisfies

    $f_n \bowtie \left( \frac{1}{R} \right)^n.$

For functions with nonnegative coefficients, including all combinatorial generating
functions, one can also adopt

    $R := \sup \{\, r \ge 0 \mid f \text{ is analytic at all points of } 0 \le z < r \,\}.$

⁸ One should think of the process defining $R$ as follows: take discs of increasing
radii $r$ and stop as soon as a singularity is encountered on the boundary. (The
dual process that would start from a large disc and restrict its radius is in
general ill-defined; think of $\sqrt{1 - z}$.)



PROOF. Let $R$ be as stated. We cannot have $R < R_{\text{conv}}(f; 0)$ since a
function is analytic everywhere in the interior of the disc of convergence of its
series representation. We cannot have $R > R_{\text{conv}}(f; 0)$ by the Boundary
Singularity Theorem. Thus $R = R_{\text{conv}}(f; 0)$. The statement then follows
from (17). The adaptation to nonnegative coefficients results from Pringsheim’s
theorem.

The exponential growth formula thus directly relates the exponential order of 
growth of coefficients of a function to the location of its singularities nearest to the 
origin. This is precisely expressed by the First Principle of Coefficient Asymptotics 
(p. 215), which, given its importance, we repeat here: 


First Principle of Coefficient Asymptotics. The location of a function’s
singularities dictates the exponential growth ($A^n$) of its coefficients.


Several direct applications to combinatorial enumeration are given below. 


EXAMPLE IV.1. Exponential growth and combinatorial enumeration. Here are a few imme- 
diate applications of exponential bounds. 


Surjections. The function

    $R(z) = (2 - e^z)^{-1}$

is the EGF of surjections. The denominator is an entire function, so that
singularities may only arise from its zeros, to be found at the points

    $\chi_k = \log 2 + 2ik\pi, \qquad k \in \mathbb{Z}.$

The dominant singularity of $R$ is then at $\rho = \chi_0 = \log 2$. Thus, with
$r_n = [z^n] R(z)$,

    $r_n \bowtie \left( \frac{1}{\log 2} \right)^n.$

Similarly, if “double” surjections are considered (each value in the range of the
surjection is taken at least twice), the corresponding EGF is

    $R^*(z) = \frac{1}{2 + z - e^z},$

with the counts starting as 1, 0, 1, 1, 7, 21, 141 (EIS A032032). The dominant
singularity is at $\rho^*$ defined as the positive root of equation
$e^{\rho^*} - \rho^* = 2$, and the coefficient $r^*_n$ satisfies
$r^*_n \bowtie (1/\rho^*)^n$. Numerically, this gives

    $r_n \bowtie 1.44269^n$   and   $r^*_n \bowtie 0.87245^n,$

with the actual figures for the corresponding logarithms being

    n           $\frac{1}{n} \log r_n$    $\frac{1}{n} \log r^*_n$
    10          0.33385                   -0.22508
    20          0.35018                   -0.18144
    50          0.35998                   -0.154449
    100         0.36325                   -0.145447
    $\infty$    0.36651                   -0.13644
                ($\log(1/\rho)$)          ($\log(1/\rho^*)$)

These estimates constitute a weak form of a more precise result to be established
later in this chapter: If random surjections of size $n$ are taken equally likely,
the probability of a surjection being a double surjection is exponentially small.
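
The figures in the table above can be reproduced with exact rational arithmetic.
The sketch below rests on one observation that is not spelled out in the text:
since $2 - e^z = 1 - \sum_{k \ge 1} z^k/k!$ and
$2 + z - e^z = 1 - \sum_{k \ge 2} z^k/k!$, the coefficients of both EGFs satisfy
the recurrence $r_n = \sum_{k \ge k_0} r_{n-k}/k!$ with $r_0 = 1$, where $k_0 = 1$
for surjections and $k_0 = 2$ for double surjections. The names
`egf_coefficients` and `log_rate` are illustrative.

```python
from fractions import Fraction
from math import factorial, log


def egf_coefficients(n_max: int, first_k: int) -> list:
    """Coefficients r_0..r_{n_max} of 1 / (1 - sum_{k >= first_k} z^k / k!).

    first_k = 1 corresponds to R(z) = 1/(2 - e^z) (surjections);
    first_k = 2 corresponds to R*(z) = 1/(2 + z - e^z) (double surjections).
    """
    r = [Fraction(1)]
    for n in range(1, n_max + 1):
        terms = (r[n - k] / factorial(k) for k in range(first_k, n + 1))
        # Start the sum at Fraction(0) so that empty sums stay exact rationals.
        r.append(sum(terms, Fraction(0)))
    return r


def log_rate(coeffs: list, n: int) -> float:
    """Return (1/n) log r_n, the quantity tabulated in the text."""
    return log(float(coeffs[n])) / n


if __name__ == "__main__":
    r = egf_coefficients(100, 1)
    r_star = egf_coefficients(100, 2)
    print([int(r_star[n] * factorial(n)) for n in range(7)])
    for n in (10, 20, 50, 100):
        print(n, round(log_rate(r, n), 5), round(log_rate(r_star, n), 5))
```

Multiplying the coefficients by $n!$ recovers the integer counts, which gives a
convenient sanity check against the initial values quoted above.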
Derangements. There, for $d_{1,n} = [z^n] e^{-z} (1 - z)^{-1}$ and
$d_{2,n} = [z^n] e^{-z - z^2/2} (1 - z)^{-1}$ we have, from the poles at $z = 1$,

    $d_{1,n} \bowtie 1^n$   and   $d_{2,n} \bowtie 1^n.$

The upper bound is combinatorially trivial. The lower bound expresses that the
probability for a random permutation to be a derangement is not exponentially
small. For $d_{1,n}$, we have already proved by an elementary argument the stronger
result $d_{1,n} \to e^{-1}$; in the case of $d_{2,n}$, we shall establish later the
precise asymptotic equivalent $d_{2,n} \to e^{-3/2}$, in accordance with what was
announced in the introduction.


Unary-Binary trees. The expression

    $U(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z} = z + z^2 + 2z^3 + 4z^4 + 9z^5 + \cdots,$

represents the OGF of (plane unlabelled) unary-binary trees. From the equivalent
form,

    $U(z) = \frac{1 - z - \sqrt{(1 - 3z)(1 + z)}}{2z},$

it follows that $U(z)$ is analytic in the complex plane slit along
$(\frac{1}{3}, +\infty)$ and $(-\infty, -1)$ and is singular at $z = -1$ and
$z = 1/3$ where it has branch points. The closest singularity to the origin being
at $\frac{1}{3}$, one has

    $U_n \bowtie 3^n.$

In this case, the stronger upper bound $U_n \le 3^n$ results directly from the
possibility of encoding such trees by words over a ternary alphabet using
Łukasiewicz codes (Chapter I). A complete asymptotic expansion will be obtained in
Chapter VII. .......... END OF EXAMPLE IV.1.
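
The estimate $U_n \bowtie 3^n$ and the bound $U_n \le 3^n$ for unary-binary trees
can be checked on exact counts. The sketch below assumes the functional equation
$U(z) = z\,(1 + U(z) + U(z)^2)$, which is equivalent to the closed form given in
the example (solve the quadratic in $U$); the function names are illustrative.

```python
import math


def unary_binary_counts(n_max: int) -> list:
    """Return [U_0, ..., U_{n_max}] from U(z) = z (1 + U(z) + U(z)^2)."""
    u = [0] * (n_max + 1)
    if n_max >= 1:
        u[1] = 1
    for n in range(2, n_max + 1):
        # Coefficient of z^(n-1) in U(z)^2; U_0 = 0, so k runs from 1 to n-2.
        square = sum(u[k] * u[n - 1 - k] for k in range(1, n - 1))
        u[n] = u[n - 1] + square
    return u


def growth_estimate(u: list, n: int) -> float:
    """Return U_n^(1/n), computed via logarithms of exact integers."""
    return math.exp(math.log(u[n]) / n)


if __name__ == "__main__":
    u = unary_binary_counts(300)
    print(u[1:8])
    for n in (10, 100, 300):
        print(n, round(growth_estimate(u, n), 4))
```

The printed roots approach 3 from below, in line with the exponential order
dictated by the branch point at $z = 1/3$.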


▷ IV.14. Coding theory bounds. Let $\mathcal{C}$ be a combinatorial class. We say
that it can be encoded with $f(n)$ bits if, for all sufficiently large values of
$n$, elements of $\mathcal{C}_n$ can be encoded as words of $f(n)$ bits. Assume
that $\mathcal{C}$ has OGF $C(z)$ with radius of singularity $R$ satisfying
$0 < R < 1$. Then, for any $\epsilon$, $\mathcal{C}$ can be encoded with
$(1 + \epsilon) \kappa n$ bits where $\kappa = -\log_2 R$, but $\mathcal{C}$
cannot be encoded with $(1 - \epsilon) \kappa n$ bits.

Similarly, if $\mathcal{C}$ has EGF $\widehat{C}(z)$ with radius of convergence
$R$ satisfying $0 < R < \infty$, $\mathcal{C}$ can be encoded with
$n \log(n/e) + (1 + \epsilon) \kappa n$ bits where $\kappa = -\log_2 R$, but
$\mathcal{C}$ cannot be encoded with $n \log(n/e) + (1 - \epsilon) \kappa n$ bits.
Singularities convey information on optimal codes! ◁


Saddle-point bounds. The exponential growth formula (Theorem IV.7) can be
supplemented by effective upper bounds which are very easy to derive and often turn
out to be surprisingly accurate. We state:

Proposition IV.1 (Saddle-Point bounds). Let $f(z)$ be analytic in the disc
$|z| < R$ with $0 < R \le \infty$. Define $M(f; r)$ for $r \in (0, R)$ by
$M(f; r) := \sup_{|z| = r} |f(z)|$. Then, one has, for any $r$ in $(0, R)$, the
family of saddle point upper bounds

(18)    $[z^n] f(z) \le \frac{M(f; r)}{r^n}$   implying   $[z^n] f(z) \le \inf_{s \in (0, R)} \frac{M(f; s)}{s^n}.$

If in addition $f(z)$ has nonnegative coefficients at 0, then

(19)    $[z^n] f(z) \le \frac{f(r)}{r^n}$   implying   $[z^n] f(z) \le \inf_{s \in (0, R)} \frac{f(s)}{s^n}.$

PROOF. In the general case of (18), the first inequality results from trivial
bounds applied to the Cauchy coefficient formula, when integration is performed
along a circle:

    $[z^n] f(z) = \frac{1}{2i\pi} \int_{|z| = r} f(z) \, \frac{dz}{z^{n+1}}.$

It is consequently valid for any $r$ smaller than the radius of convergence of $f$
at 0. The second inequality in (18) plainly represents the best possible bound of
this type.

In the positive case of (19), the bounds can be viewed as a direct specialization
of (18). (Alternatively, they can be obtained elementarily since, in the case of
positive coefficients,

    $f_n \le \frac{f_0}{r^n} + \cdots + \frac{f_{n-1}}{r} + f_n + f_{n+1} r + f_{n+2} r^2 + \cdots = \frac{f(r)}{r^n},$

whenever the $f_k$ are nonnegative.)

Note that the value $s$ that provides the best bound in (19) can be determined by
cancelling a derivative,

(20)    $s \, \frac{f'(s)}{f(s)} = n.$

Thanks to the universal character of the first bound, any approximate solution of
this last equation will in fact provide a valid upper bound.



For reasons well explained by the saddle point method (Chapter VIII), these
bounds usually capture the actual asymptotic behaviour up to a polynomial factor
only. A typical instance is the weak form of Stirling’s formula,

    $\frac{1}{n!} = [z^n] e^z \le \frac{e^n}{n^n},$

which only overestimates the true asymptotic value by a factor of $\sqrt{2\pi n}$.
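
The weak form of Stirling’s formula lends itself to a quick numerical check. For
$f(z) = e^z$, equation (20) reads $s = n$, so the best bound of type (19) is
$e^n/n^n$. The sketch below compares it to the exact coefficient $1/n!$ through
logarithms; the function names are illustrative, and `math.lgamma(n + 1)` is used
as a floating-point stand-in for $\log n!$.

```python
import math


def log_saddle_bound_exp(n: int) -> float:
    """Return log(e^n / n^n), the bound (19) for f(z) = e^z taken at s = n."""
    return n - n * math.log(n)


def overestimate_factor(n: int) -> float:
    """Return the ratio of the bound e^n / n^n to the exact coefficient 1/n!."""
    return math.exp(log_saddle_bound_exp(n) + math.lgamma(n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        ratio = overestimate_factor(n)
        # The second column should be close to 1.
        print(n, round(ratio, 4), round(ratio / math.sqrt(2 * math.pi * n), 6))
```

The normalized ratio tends to 1, which is the polynomial-factor discrepancy
announced in the text.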
▷ IV.15. A suboptimal but easy saddle-point bound. Let $f(z)$ be analytic in
$|z| < 1$ with nonnegative coefficients. Assume that $f(x) \le (1 - x)^{-\beta}$
for some $\beta \ge 0$ and all $x \in (0, 1)$. Then

    $[z^n] f(z) = O(n^{\beta}).$

(Better bounds of the form $O(n^{\beta - 1})$ are usually obtained by the method of
singularity analysis exposed in Chapter VI.) ◁


EXAMPLE IV.2. Combinatorial examples of saddle point bounds. Here are applications
to fragmented permutations, set partitions (Bell numbers), involutions, and integer
partitions.

Fragmented permutations. Consider first fragmented permutations defined by
$\mathcal{F} = \text{SET}(\text{SEQ}_{\ge 1}(\mathcal{Z}))$ in the labelled
universe (Chapter II, p. 115). The EGF is $e^{z/(1-z)}$, and we claim that

(21)    $\frac{1}{n!} F_n = [z^n] e^{z/(1-z)} \le e^{2\sqrt{n} - \frac{1}{2} + O(n^{-1/2})}.$

Indeed, the minimizing radius of the saddle point bound (19) is $s$ such that

    $0 = \frac{d}{ds} \left( \frac{s}{1 - s} - n \log s \right) = \frac{1}{(1 - s)^2} - \frac{n}{s}.$

The equation is solved by $s = (2n + 1 - \sqrt{4n + 1})/(2n)$. One can either use
this exact value and compute an asymptotic approximation of $f(s)/s^n$, or adopt
right away the approximate value $s_1 = 1 - 1/\sqrt{n}$, which leads to simpler
calculations. The estimate (21) results. It is off from the actual asymptotic value
only by a factor of order $n^{-3/4}$ (cf. Example VIII.6, p. 527).

    n      $\tilde{I}_n$               $I_n$
    100    $0.106579 \cdot 10^{85}$    $0.240533 \cdot 10^{83}$
    200    $0.231809 \cdot 10^{195}$   $0.367247 \cdot 10^{193}$
    300    $0.383502 \cdot 10^{316}$   $0.494575 \cdot 10^{314}$
    400    $0.869362 \cdot 10^{444}$   $0.968454 \cdot 10^{442}$
    500    $0.425391 \cdot 10^{578}$   $0.423108 \cdot 10^{576}$

FIGURE IV.5. A comparison of the exact number of involutions $I_n$ to its
approximation $\tilde{I}_n = n! \, e^{\sqrt{n} + n/2} n^{-n/2}$: [left] a table;
[right] a plot (not reproduced) of $\log_{10}(\tilde{I}_n / I_n)$ against
$\log_{10} n$ suggesting that the ratio satisfies
$\tilde{I}_n / I_n \sim K \cdot n^{1/2}$, the slope of the line being
$\approx \frac{1}{2}$.
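
Returning to the fragmented permutations bound (21), the two choices of radius
discussed there can be compared numerically. The sketch below relies on one fact
not stated in the text: expanding $e^{z/(1-z)} = \sum_k z^k (1-z)^{-k}/k!$ gives
the exact coefficient $[z^n] e^{z/(1-z)} = \sum_{k=1}^{n} \binom{n-1}{k-1}/k!$ for
$n \ge 1$. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, log, sqrt


def frag_coefficient(n: int) -> Fraction:
    """Exact [z^n] exp(z/(1-z)) = sum_{k=1..n} binomial(n-1, k-1) / k!, n >= 1."""
    terms = (Fraction(comb(n - 1, k - 1), factorial(k)) for k in range(1, n + 1))
    return sum(terms, Fraction(0))


def log_bound(n: int, s: float) -> float:
    """Return log(f(s) / s^n) for f(z) = exp(z/(1-z)), with 0 < s < 1."""
    return s / (1 - s) - n * log(s)


def exact_saddle(n: int) -> float:
    """Root in (0, 1) of s = n (1 - s)^2, the minimizing radius in (19)."""
    return (2 * n + 1 - sqrt(4 * n + 1)) / (2 * n)


if __name__ == "__main__":
    n = 100
    print("log of exact coefficient :", log(float(frag_coefficient(n))))
    print("log bound, exact saddle  :", log_bound(n, exact_saddle(n)))
    print("log bound, s1 = 1-1/sqrt :", log_bound(n, 1 - 1 / sqrt(n)))
    print("2 sqrt(n) - 1/2          :", 2 * sqrt(n) - 0.5)
```

Both radii give valid upper bounds; the approximate radius $s_1$ loses very
little compared to the exact minimizer, which is the point made in the text.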


Bell numbers and set partitions. Another immediate application is an upper bound on
Bell numbers enumerating set partitions,
$\mathcal{S} = \text{SET}(\text{SET}_{\ge 1}(\mathcal{Z}))$, with EGF
$e^{e^z - 1}$. According to (20), the best saddle point bound is obtained for $s$
such that $s e^s = n$. Thus,

(22)    $\frac{1}{n!} S_n \le e^{e^s - 1 - n \log s}$   where   $s : \ s e^s = n,$

where, additionally, $s = \log n - \log\log n + o(\log\log n)$. See Chapter VIII,
p. 525 for the complete saddle point analysis.

Involutions. Involutions are specified by
$\mathcal{I} = \text{SET}(\text{CYC}_{1,2}(\mathcal{Z}))$ and have EGF
$I(z) = \exp(z + \frac{1}{2} z^2)$. One determines, by choosing $s = \sqrt{n}$ as
an approximate solution to (20):

(23)    $\frac{1}{n!} I_n \le \frac{e^{\sqrt{n} + n/2}}{n^{n/2}}.$

(See Figure 5 for numerical data and Example VIII.4, p. 524 for a full analysis.)
Similar bounds hold for permutations with all cycle lengths $\le k$ and
permutations $\sigma$ such that $\sigma^k = \text{Id}$.
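
The bound (23) on involutions can be confronted with exact values. The sketch
below uses the classical recurrence $I_n = I_{n-1} + (n-1) I_{n-2}$, which follows
from $I'(z) = (1 + z) I(z)$ but is not stated in the text, and evaluates the right
side of (23) multiplied by $n!$ in logarithmic form. Function names are
illustrative.

```python
import math


def involutions(n_max: int) -> list:
    """Return [I_0, ..., I_{n_max}] via I_n = I_{n-1} + (n-1) I_{n-2}."""
    inv = [1, 1]
    for n in range(2, n_max + 1):
        inv.append(inv[n - 1] + (n - 1) * inv[n - 2])
    return inv[: n_max + 1]


def log10_bound(n: int) -> float:
    """Return log10 of n! e^(sqrt(n) + n/2) / n^(n/2), i.e. n! times (23)."""
    natural = (math.lgamma(n + 1) + math.sqrt(n) + n / 2
               - (n / 2) * math.log(n))
    return natural / math.log(10)


if __name__ == "__main__":
    inv = involutions(500)
    for n in (100, 200, 300, 400, 500):
        gap = log10_bound(n) - math.log10(inv[n])
        # gap = log10(bound / exact); it grows roughly like (1/2) log10(n).
        print(n, round(math.log10(inv[n]), 4), round(log10_bound(n), 4),
              round(gap, 4))
```

Plotting the last column against $\log_{10} n$ gives an almost straight line of
slope close to $\frac{1}{2}$, so the saddle point bound is off by a factor of
order $\sqrt{n}$ only.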


Integer partitions. The function

(24)    $P(z) = \prod_{k=1}^{\infty} \frac{1}{1 - z^k} = \exp \left( \sum_{\ell=1}^{\infty} \frac{1}{\ell} \, \frac{z^{\ell}}{1 - z^{\ell}} \right)$

is the OGF of integer partitions, an unlabelled analogue of set partitions. Its
radius of convergence is a priori bounded from above by 1, since the set
$\mathcal{P}$ is infinite and the second form of $P(z)$ shows that it is exactly
equal to 1. Therefore $P_n \bowtie 1^n$. A finer upper bound results from the
estimate

(25)    $\Lambda(t) := \log P(e^{-t}) \sim \frac{\pi^2}{6t} + \log \sqrt{\frac{t}{2\pi}} - \frac{1}{24} t + O(t^2),$

which obtains from Euler–Maclaurin summation or, better, from a Mellin analysis
following APPENDIX B: Mellin transform, p. 707. Indeed, the Mellin transform of
$\Lambda$ is, by the harmonic sum rule,

    $\Lambda^*(s) = \zeta(s) \zeta(s + 1) \Gamma(s), \qquad s \in \langle 1, +\infty \rangle,$

and the successive leftmost poles at $s = 1$ (simple pole), $s = 0$ (double pole),
and $s = -1$ (simple pole) translate into the asymptotic expansion (25). When
$z \to 1^-$, we have

(26)    $P(z) \sim \frac{e^{-\pi^2/12}}{\sqrt{2\pi}} \sqrt{1 - z} \, \exp \left( \frac{\pi^2}{6(1 - z)} \right),$

from which we derive (choose $s = 1 - D/\sqrt{n}$ with $D = \pi/\sqrt{6}$ as an
approximate solution to (20))

    $P_n \le C n^{-1/4} e^{\pi \sqrt{2n/3}},$

for some $C > 0$. This last bound is once more only off by a polynomial factor, as
we shall prove when studying the saddle point method (Proposition VIII.8, p. 544).
END OF EXAMPLE IV.2.
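
The estimate (25) and the resulting saddle point bound on partition numbers can be
tested numerically. In the sketch below, $\Lambda(t)$ is summed directly from the
product form (24), the partition numbers come from the usual dynamic programming
over part sizes (a standard technique, not taken from the text), and the bound
$P_n \le P(e^{-t}) e^{nt}$ is evaluated at $t = \pi/\sqrt{6n}$, the value that
balances the two leading terms. Function names are illustrative.

```python
import math


def partition_numbers(n_max: int) -> list:
    """Return [P_0, ..., P_{n_max}] by dynamic programming over part sizes."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p


def log_P(t: float) -> float:
    """Lambda(t) = log P(e^-t) = -sum_{k>=1} log(1 - e^(-k t)), for t > 0."""
    total, k = 0.0, 1
    while True:
        x = math.exp(-k * t)
        if x < 1e-18:  # remaining terms are below double precision
            return total
        total -= math.log1p(-x)
        k += 1


def lambda_estimate(t: float) -> float:
    """Main terms of (25): pi^2/(6t) + log sqrt(t/(2 pi)) - t/24."""
    return (math.pi ** 2 / (6 * t) + 0.5 * math.log(t / (2 * math.pi))
            - t / 24)


if __name__ == "__main__":
    p = partition_numbers(200)
    for n in (10, 100, 200):
        t = math.pi / math.sqrt(6 * n)
        print(n, round(math.log(p[n]), 3), round(log_P(t) + n * t, 3),
              round(math.pi * math.sqrt(2 * n / 3), 3))
```

The middle column (logarithm of the saddle point bound) stays above $\log P_n$ and
tracks $\pi \sqrt{2n/3}$ up to a logarithmic correction, as the text asserts.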


▷ IV.16. A natural boundary. One has $P(r e^{i\theta}) \to \infty$ as
$r \to 1^-$, for any angle $\theta$ that is a rational multiple of $2\pi$. Points
$e^{2i\pi p/q}$ being dense on the unit circle, the function $P(z)$ admits the unit
circle as a natural boundary, i.e., it cannot be analytically continued beyond this
circle. ◁


IV. 4. Closure properties and computable bounds 


Analytic functions are robust: they satisfy a rich set of closure properties. This
fact makes possible the determination of exponential growth constants for
coefficients of a wide range of classes of functions. Theorem IV.8 below expresses
computability of the growth rate for every class that admits an iterative
specification. It is the first result of this sort that relates symbolic methods of
Part A with analytic methods developed here.


Closure properties of analytic functions. The functions analytic at a point
$z = a$ are closed under sum and product, and hence form a ring. If $f(z)$ and
$g(z)$ are analytic at $z = a$, then so is their quotient $f(z)/g(z)$ provided
$g(a) \ne 0$. Meromorphic functions are furthermore closed under quotient and hence
form a field. Such properties are proved most easily using complex-differentiability
and extending the usual relations from real analysis, for instance,
$(f + g)' = f' + g'$, $(fg)' = fg' + f'g$.

Analytic functions are also closed under composition: if $f(z)$ is analytic at
$z = a$ and $g(w)$ is analytic at $b = f(a)$, then $g \circ f(z)$ is analytic at
$z = a$. Graphically:

    $a \ \xrightarrow{\ f\ } \ b = f(a) \ \xrightarrow{\ g\ } \ c = g(b).$

The proof based on complex-differentiability closely mimics the real case. Inverse
functions exist conditionally: if $f'(a) \ne 0$, then $f(z)$ is locally linear near
$a$, hence invertible, so that there exists a $g$ satisfying
$f \circ g = g \circ f = \text{Id}$, where Id is the identity function,
$\text{Id}(z) = z$. The inverse function is itself locally linear, hence complex
differentiable, hence analytic. In short, the inverse of an analytic function $f$
at a place where the derivative does not vanish is an analytic function.

▷ IV.17. The analytic inversion lemma. Let $f$ be analytic on $\Omega \ni z_0$ and
satisfy $f'(z_0) \ne 0$. Then there exists a small region
$\Omega_1 \subseteq \Omega$ containing $z_0$ and a $C > 0$ such that
$|f(z) - f(z')| > C |z - z'|$, for all $z, z' \in \Omega_1$. Consequently, $f$
maps bijectively $\Omega_1$ on $f(\Omega_1)$. ◁

One way to establish closure properties, as suggested above, is to deduce
analyticity criteria from complex differentiability by way of the Basic Equivalence
Theorem (Theorem IV.1). An alternative approach, closer to the original notion of
analyticity, can be based on a two-step process: (i) closure properties are shown
to hold true for formal power series; (ii) the resulting formal power series are
proved to be locally convergent by means of suitable majorizations on their
coefficients. This is the basis of the classical method of majorant series
originating with Cauchy.


▷ IV.18. The majorant series technique. Given two power series, define
$f(z) \preceq g(z)$ if $|[z^n] f(z)| \le [z^n] g(z)$ for all $n \ge 0$. The
following two conditions are equivalent: (i) $f(z)$ is analytic in the disc
$|z| < \rho$; (ii) for any $r > \rho^{-1}$ there exists a $c$ such that

    $f(z) \preceq \frac{c}{1 - rz}.$

If $f$, $g$ are majorized by $c/(1 - rz)$, $d/(1 - rz)$ respectively, then $f + g$
and $f \cdot g$ are majorized,

    $f(z) + g(z) \preceq \frac{c + d}{1 - rz}, \qquad f(z) \cdot g(z) \preceq \frac{e}{1 - sz},$

for any $s > r$ and for some $e$ dependent on $s$. Similarly, the composition
$f \circ g$ is majorized:

    $f \circ g(z) \preceq \frac{c}{1 - r(1 + d) z}.$

Constructions for $1/f$ and for the functional inverse of $f$ can be similarly
developed. See Cartan’s book [79] and van der Hoeven’s study [477] for a systematic
treatment. ◁


For functions defined by analytic expressions, singularities can be determined 
inductively in an intuitively transparent manner. If Sing(f) and Zero(f) are respec- 
tively the set of singularities and zeros of function f, then, due to closure properties 
of analytic functions, the following informally stated guidelines apply. 


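    $\text{Sing}(f \pm g)    \subseteq \text{Sing}(f) \cup \text{Sing}(g)$
    $\text{Sing}(f \times g) \subseteq \text{Sing}(f) \cup \text{Sing}(g)$
    $\text{Sing}(f / g)      \subseteq \text{Sing}(f) \cup \text{Sing}(g) \cup \text{Zero}(g)$
    $\text{Sing}(f \circ g)  \subseteq \text{Sing}(g) \cup g^{(-1)}(\text{Sing}(f))$
    $\text{Sing}(\sqrt{f})   \subseteq \text{Sing}(f) \cup \text{Zero}(f)$
    $\text{Sing}(\log(f))    \subseteq \text{Sing}(f) \cup \text{Zero}(f)$
    $\text{Sing}(f^{(-1)})   \subseteq f(\text{Sing}(f)) \cup f(\text{Zero}(f')).$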


A mathematically rigorous treatment would require considering multivalued functions
and Riemann surfaces, so that we do not state detailed validity conditions and, at
this stage, keep for these formulae the status of useful heuristics. In fact,
because of Pringsheim’s theorem, the search for dominant singularities of
combinatorial generating functions can normally avoid considering the complete
multivalued structure of functions, since only some initial segment of the positive
real half-line needs to be considered. This in turn implies a powerful and easy way
of determining the exponential order of coefficients of a wide variety of
generating functions, as we explain next.


Computability of exponential growth constants. As defined in Chapters I and II, 
a combinatorial class is constructible or specifiable if it can be specified by a finite 
set of equations involving only the basic constructors. A specification is iterative or 
non-recursive if in addition the dependency graph of the specification is acyclic, that 
is, no recursion is involved and a single functional term (written with sums, products, 
as well as sequence, set, and cycle constructions) describes the specification. 

Our interest here is in effective computability issues. We recall that a real
number $\alpha$ is computable iff there exists a program $\Pi_\alpha$, which on
input $m$ outputs a rational number $\alpha_m$ guaranteed to be within
$\pm 10^{-m}$ of $\alpha$. We state:


Theorem IV.8 (Computability of growth). Let $\mathcal{C}$ be a constructible
unlabelled class that admits of an iterative specification in terms of
(SEQ, PSET, MSET, CYC; $+$, $\times$) starting with $(1, \mathcal{Z})$. Then the
radius of convergence $\rho_C$ of the OGF $C(z)$ of $\mathcal{C}$ is either
$+\infty$ or a (strictly) positive computable real number.
Let $\mathcal{D}$ be a constructible labelled class that admits of an iterative
specification in terms of (SEQ, SET, CYC; $+$, $\star$) starting with
$(1, \mathcal{Z})$. Then the radius of convergence $\rho_D$ of the EGF $D(z)$ of
$\mathcal{D}$ is either $+\infty$ or a (strictly) positive computable real number.

Accordingly, if finite, the constants $\rho_C$, $\rho_D$ in the exponential growth
estimates,

    $[z^n] C(z) \equiv C_n \bowtie \left( \frac{1}{\rho_C} \right)^n, \qquad [z^n] D(z) \equiv \frac{1}{n!} D_n \bowtie \left( \frac{1}{\rho_D} \right)^n,$

are computable numbers.


PROOF. In both cases, the proof proceeds by induction on the structural
specification of the class. For each class $\mathcal{F}$, with generating function
$F(z)$, we associate a signature, which is an ordered pair
$\langle \rho_F, \tau_F \rangle$, where $\rho_F$ is the radius of convergence of
$F$ and $\tau_F$ is the value of $F$ at $\rho_F$, precisely,

    $\tau_F := \lim_{x \to \rho_F^-} F(x).$

(The value $\tau_F$ is well defined as an element of $\mathbb{R} \cup \{+\infty\}$
since $F$, being a counting generating function, is necessarily increasing on
$(0, \rho_F)$.)


Unlabelled case. An unlabelled class $\mathcal{G}$ is either finite, in which case
its OGF $G(z)$ is a polynomial, or infinite, in which case it diverges at $z = 1$,
so that $\rho_G \le 1$. It is clearly decidable, given the specification, whether a
class is finite or not: a necessary and sufficient condition for the class to be
infinite is that one of the unary constructors (SEQ, MSET, CYC) intervenes in the
specification. We prove (by induction) the assertion of the theorem together with
the stronger property that $\tau_F = \infty$ as soon as the class is infinite.

First, the signatures of the neutral class 1 and the atomic class $\mathcal{Z}$,
with OGF 1 and $z$, are $\langle +\infty, 1 \rangle$ and
$\langle +\infty, +\infty \rangle$. Any nonconstant polynomial which is the OGF of
a finite set has the signature $\langle +\infty, +\infty \rangle$. The assertion is
thus easily verified in these cases.

Next, let $\mathcal{F} = \text{SEQ}(\mathcal{G})$. The OGF $G(z)$ must be
nonconstant and in fact satisfy $G(0) = 0$ in order for the sequence construction
to be properly defined. Thus, by the induction hypothesis, one has
$0 < \rho_G \le +\infty$ and $\tau_G = +\infty$. Now, the function $G$ being
increasing and continuous along the positive axis, there must exist a value $\beta$
such that $0 < \beta < \rho_G$ with $G(\beta) = 1$. For $z \in (0, \beta)$, the
quasi-inverse $F(z) = (1 - G(z))^{-1}$ is well defined and analytic; as $z$
approaches $\beta$ from the left, $F(z)$ increases unboundedly. Thus, the smallest
singularity of $F$ along the positive axis is at $\beta$, and by Pringsheim’s
theorem, one has $\rho_F = \beta$. The argument simultaneously shows that
$\tau_F = +\infty$. There only remains to check that $\beta$ is computable. The
coefficients of $G$ form a computable sequence of integers, so that $G(x)$, which
can be well approximated via truncated Taylor series, is an effectively computable
number⁹ if $x$ is itself a positive computable number less than $\rho_G$. Then
binary search provides an effective procedure for determining $\beta$.
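
The binary search invoked at the end of the sequence case can be sketched as
follows, under a simplifying assumption that is not part of the proof: $G$ is
taken to be a polynomial with nonnegative integer coefficients, $G(0) = 0$ and
$G(1) \ge 1$, so that $G(x)$ can be evaluated exactly on rationals and no
truncation of a Taylor series is needed. The function then plays the role of the
program $\Pi_\alpha$ of the computability definition, returning a rational within
$10^{-m}$ of $\beta$. The names `evaluate` and `beta_approximation` are
illustrative.

```python
from fractions import Fraction


def evaluate(coeffs: list, x: Fraction) -> Fraction:
    """Exact value of G(x) = sum_k coeffs[k] x^k, by Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value


def beta_approximation(coeffs: list, m: int) -> Fraction:
    """Return a rational within 10^-m of the root beta in (0, 1] of G(beta) = 1.

    Assumption (hypothetical setting): G is a polynomial with nonnegative
    integer coefficients, G(0) = 0 and G(1) >= 1, hence increasing on [0, 1].
    """
    lo, hi = Fraction(0), Fraction(1)
    tolerance = Fraction(1, 10 ** m)
    # Invariant: G(lo) < 1 <= G(hi), so beta lies in (lo, hi].
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) < 1:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # F = SEQ(Z + Z x Z): G(z) = z + z^2, so beta = (sqrt(5) - 1) / 2.
    beta = beta_approximation([0, 1, 1], 12)
    print(float(beta), 1 / float(beta))
```

For $G(z) = z + z^2$ the growth constant $1/\beta$ is the golden ratio, as
expected for compositions into parts of size 1 and 2. In the general case of the
proof, the exact evaluation would be replaced by truncated Taylor series with
controlled error.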


⁹ The present argument only establishes non-constructively the existence of a
program, based on the fact that truncated Taylor series converge geometrically fast
at an interior point of their disc of convergence. Making explicit this program and
the involved parameters from the specification itself however represents a much
harder problem (that of “uniformity” with respect to specifications) that is not
addressed here.

Next, we consider the multiset construction,
$\mathcal{F} = \text{MSET}(\mathcal{G})$, whose translation into OGFs necessitates
the Pólya exponential of Chapter I (p. 32):

    $F(z) = \text{Exp}(G(z))$   where   $\text{Exp}(h(z)) := \exp \left( h(z) + \frac{1}{2} h(z^2) + \frac{1}{3} h(z^3) + \cdots \right).$

Once more, the induction hypothesis is assumed for $G$. If $G$ is a polynomial,
then $F$ is a rational function with poles at roots of unity only. Thus,
$\rho_F = 1$ and $\tau_F = \infty$ in that particular case. In the general case of
$\mathcal{F} = \text{MSET}(\mathcal{G})$ with $\mathcal{G}$ infinite, we start by
fixing arbitrarily a number $r$ such that $0 < r < \rho_G \le 1$ and examine $F(z)$
for $z \in (0, r)$. The expression for $F$ rewrites as

    $\text{Exp}(G(z)) = e^{G(z)} \cdot \exp \left( \frac{1}{2} G(z^2) + \frac{1}{3} G(z^3) + \cdots \right).$

The first factor is analytic for $z$ on $(0, \rho_G)$ since, the exponential
function being entire, $e^G$ has the singularities of $G$. As to the second factor,
one has $G(0) = 0$ (in order for the set construction to be well-defined), while
$G(x)$ is convex for $x \in [0, r]$ (since its second derivative is positive).
Thus, there exists a positive constant $K$ such that $G(x) \le Kx$ when
$x \in [0, r]$. Then, the series $\frac{1}{2} G(z^2) + \frac{1}{3} G(z^3) + \cdots$
has its terms dominated by those of the convergent series

    $\frac{K}{2} r^2 + \frac{K}{3} r^3 + \cdots = K \log (1 - r)^{-1} - Kr.$

By a well-known theorem of analytic function theory, a uniformly convergent sum of
analytic functions is itself analytic; consequently,
$\frac{1}{2} G(z^2) + \frac{1}{3} G(z^3) + \cdots$ is analytic at all $z$ of
$(0, r)$. Analyticity is then preserved by the exponential, so that $F(z)$, being
analytic at $z \in (0, r)$ for any $r < \rho_G$, has a radius of convergence that
satisfies $\rho_F \ge \rho_G$. On the other hand, since $F(z)$ dominates termwise
$G(z)$, one has $\rho_F \le \rho_G$. Thus finally one has $\rho_F = \rho_G$. Also,
$\tau_G = +\infty$ implies $\tau_F = +\infty$.

A parallel discussion covers the case of the powerset construction (PSET) whose
associated functional $\overline{\text{Exp}}$ is a minor modification of the Pólya
exponential Exp. The cycle construction can be treated by similar arguments based
on consideration of “Pólya’s logarithm” as
$\mathcal{F} = \text{CYC}(\mathcal{G})$ corresponds to

    $F(z) = \text{Log} \, \frac{1}{1 - G(z)}$,   where   $\text{Log} \, h(z) = \log h(z) + \frac{1}{2} \log h(z^2) + \cdots.$

In order to conclude with the unlabelled case, there only remains to discuss the
binary constructors $+$, $\times$, which give rise to $F = G + H$,
$F = G \cdot H$. It is easily verified that $\rho_F = \min(\rho_G, \rho_H)$.
Computability is granted since the minimum of two computable numbers is computable.
That $\tau_F = +\infty$ in each case is immediate.


Labelled case. The labelled case is covered by the same type of argument as above, 
the discussion being even simpler, since the ordinary exponential and logarithm re- 
place the Polya operators Exp and Log. It is still a fact that all the EGFs of infinite 
nonrecursive classes are infinite at their dominant positive singularity, though the radii 
of convergence can now be of any magnitude (compared to 1). 
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> IV.19. Relativized constructions. This is an exercise in induction. Theorem I'V.8 is stated for 
specifications involving the basic constructors. Show that the conclusion still holds if the corre- 
sponding relativized constructions (R=,, R<r, R>r With K being any of the basic constructors) 
are also allowed. <q 


> IV.20. Syntactically decidable properties. For unlabelled classes F, the property pr = 1 is 
decidable. For labelled and unlabelled classes, the property p7 = +00 is decidable. dq 


> IV.21. Pélya—Carlson and a curious property of OGFs. Here is a statement first conjectured 
by Polya, then proved by Carlson in 1921 (see [128, p. 323]): If a function is represented by 
a power series with integer coefficients that converges inside the unit disc, then either it is a 
rational function or it admits the unit circle as a natural boundary. This theorem applies in 
particular to the OGF of any combinatorial class. 


▷ IV.22. Trees are recursive structures only! General and binary trees cannot receive an
iterative specification since their OGFs assume a finite value at their Pringsheim singularity. [The
same is true of most simple families of trees; cf. Proposition IV.5, p. 264.] ◁


▷ IV.23. Nonconstructibility of permutations and graphs. The class P of all permutations
cannot be specified as a constructible unlabelled class since the OGF $P(z) = \sum_n n!\,z^n$ has
radius of convergence 0. (It is of course constructible as a labelled class.) Graphs, whether
labelled or unlabelled, are too numerous to form a constructible class. ◁


Theorem IV.8 establishes a link between analytic combinatorics, computability 
theory, and symbolic manipulation systems. It is based on an article of Flajolet, Salvy, 
and Zimmermann [206] devoted to such computability issues in exact and asymptotic 
enumeration. (Recursive specifications are not discussed now since they tend to give 
rise to branch points, themselves amenable to singularity analysis techniques to be 
developed in Chapters VI and VII.) The inductive process, implied by the proof of 
Theorem IV.8, that decorates a specification with the radius of convergence of each of
its subexpressions provides a practical basis for determining the exponential growth 
rate of counts associated to a nonrecursive specification. The example of trains de- 
tailed below is typical. 


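EXAMPLE IV.3. Combinatorial trains. This somewhat artificial example from [173] (see
Figure 6) serves to illustrate the scope of Theorem IV.8 and demonstrate its inner mechanisms
at work. Define the class of all labelled trains by the following specification,

\[ \begin{array}{lll}
\mathcal{T}r &=& \mathcal{W}a \star \mathrm{SEQ}(\mathcal{W}a \star \mathrm{SET}(\mathcal{P}a)), \\
\mathcal{W}a &=& \mathrm{SEQ}_{\geq 1}(\mathcal{P}\ell), \\
\mathcal{P}\ell &=& \mathcal{Z} \star \mathcal{Z} \star (1 + \mathrm{CYC}(\mathcal{Z})), \\
\mathcal{P}a &=& \mathrm{CYC}(\mathcal{Z}) \star \mathrm{CYC}(\mathcal{Z}).
\end{array} \tag{27} \]

In figurative terms, a train ($\mathcal{T}r$) is composed of a first wagon ($\mathcal{W}a$) to which is appended a
sequence of passenger wagons, each of the latter capable of containing a set of passengers
($\mathcal{P}a$). A wagon is itself composed of “planks” ($\mathcal{P}\ell$) determined by their end points ($\mathcal{Z} \star \mathcal{Z}$) and
to which a circular wheel ($\mathrm{CYC}(\mathcal{Z})$) may be attached. A passenger is composed of a head and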
a belly that are each circular arrangements of atoms. Here is a depiction of a random train:

[Drawing of a random train.]

FIGURE IV.6. The inductive determination of the radius of convergence of the EGF of
trains: (top) a hierarchical view of the specification of $\mathcal{T}r$; (bottom left) the corresponding
expression tree of the EGF $Tr(z)$; (bottom right) the value of the radii for each subexpression
of $Tr(z)$ (with $L(y) = \log(1-y)^{-1}$, $S(y) = (1-y)^{-1}$, $S_1(y) = yS(y)$). The radii shown are
0.68245 at the nodes for $Wa(z)$ and 0.48512 at the root.

The translation into a set of EGF equations is immediate and a symbolic manipulation system
readily provides the form of the EGF of trains as

\[ Tr(z) = \frac{z^2\left(1+\log((1-z)^{-1})\right)}{1 - z^2\left(1+\log((1-z)^{-1})\right)}
\left(1 - \frac{z^2\left(1+\log((1-z)^{-1})\right)e^{(\log((1-z)^{-1}))^2}}{1 - z^2\left(1+\log((1-z)^{-1})\right)}\right)^{-1}, \]

together with the expansion

\[ Tr(z) = 2\frac{z^2}{2!} + 6\frac{z^3}{3!} + 60\frac{z^4}{4!} + 520\frac{z^5}{5!} + 6660\frac{z^6}{6!} + 93408\frac{z^7}{7!} + \cdots. \]
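The expansion of $Tr(z)$ can be reproduced without a computer algebra system. The following is a minimal sketch, assuming truncated power series stored as Python lists of exact fractions; the helper names (`mul`, `seq`, `exp`) and the truncation order `N` are illustrative choices, not part of the text.

```python
from fractions import Fraction
from math import factorial

N = 8  # all series are truncated modulo z^N (an arbitrary illustrative order)


def mul(a, b):
    """Product of two truncated power series (lists of N coefficients)."""
    c = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j in range(N - i):
            c[i + j] += ai * b[j]
    return c


def seq(f):
    """1/(1 - f) for a series f with zero constant term."""
    total = [Fraction(1)] + [Fraction(0)] * (N - 1)
    power = total[:]
    for _ in range(N - 1):
        power = mul(power, f)
        total = [t + p for t, p in zip(total, power)]
    return total


def exp(f):
    """exp(f) for a series f with zero constant term."""
    total = [Fraction(1)] + [Fraction(0)] * (N - 1)
    term = total[:]
    for k in range(1, N):
        term = [t / k for t in mul(term, f)]   # term is now f^k / k!
        total = [t + u for t, u in zip(total, term)]
    return total


L = [Fraction(0)] + [Fraction(1, n) for n in range(1, N)]  # log(1/(1-z))
one_plus_L = [Fraction(1)] + L[1:]
z2 = [Fraction(0)] * N
z2[2] = Fraction(1)

Pl = mul(z2, one_plus_L)             # planks:     Z * Z * (1 + CYC(Z))
Wa = mul(Pl, seq(Pl))                # wagons:     SEQ_{>=1}(Pl)
Pa = mul(L, L)                       # passengers: CYC(Z) * CYC(Z)
Tr = mul(Wa, seq(mul(Wa, exp(Pa))))  # trains:     Wa * SEQ(Wa * SET(Pa))

# n! [z^n] Tr(z): the number of labelled trains of size n
train_counts = [int(Tr[n] * factorial(n)) for n in range(N)]
print(train_counts[2:])
```

Each line of the specification (27) becomes one line of series arithmetic, which is the point of the symbolic method; raising `N` gives further coefficients at the cost of more exact arithmetic.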


The specification (27) has a hierarchical structure, as suggested by the top representation of
Figure 6, and this structure is itself directly reflected by the form of the expression tree of the GF
$Tr(z)$. Then each node in the expression tree of $Tr(z)$ can be tagged with the corresponding
value of the radius of convergence. This is done according to the principles of Theorem IV.8;
see the bottom-right part of Figure 6. For instance, the quantity 0.68245 associated to $Wa(z)$
is given by the sequence rule and is determined as the smallest positive solution to the equation

\[ z^2\left(1 + \log(1-z)^{-1}\right) = 1. \]

The tagging process works upwards till the root of the tree is reached; here the radius of
convergence of $Tr$ is determined to be $\rho \doteq 0.48512\cdots$, a quantity that happens to coincide with
the ratio $[z^{49}]Tr(z)/[z^{50}]Tr(z)$ to more than 15 decimal places. END OF EXAMPLE IV.3.
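The tagging process can be imitated numerically. The sketch below is an illustration under simple assumptions (plain bisection, double precision, hypothetical function names): it applies the sequence rule twice, first to the planks to obtain the radius of $Wa(z)$, then to $Wa(z)\,e^{Pa(z)}$ to obtain the radius of $Tr(z)$.

```python
from math import exp, log


def root_of_equals_one(g, lo, hi, iterations=200):
    """Bisection for g(z) = 1, assuming g is increasing on (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def L(z):
    """log(1/(1-z)), the EGF of cycles."""
    return log(1 / (1 - z))


def Pl(z):
    """EGF of planks, z^2 (1 + log(1/(1-z)))."""
    return z * z * (1 + L(z))


def Wa(z):
    """EGF of wagons, valid for z below the radius of Wa."""
    return Pl(z) / (1 - Pl(z))


def Pa(z):
    """EGF of passengers."""
    return L(z) ** 2


# Sequence rule: the radius is the smallest positive z where the argument equals 1.
rho_Wa = root_of_equals_one(Pl, 0.0, 0.999)
rho_Tr = root_of_equals_one(lambda z: Wa(z) * exp(Pa(z)), 0.0, rho_Wa)
print(round(rho_Wa, 5), round(rho_Tr, 5))
```

The second search is restricted to the interval below `rho_Wa`, since the radius of a subexpression bounds the radius of any expression built on it.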


IV.5. Rational and meromorphic functions 


The last section has fully justified the First Principle of coefficient asymptotics
leading to the exponential growth formula $f_n \bowtie A^n$ for the coefficients of an analytic
function $f(z)$. Indeed, as we saw, one has $A = 1/\rho$, where $\rho$ equals both the radius of
convergence of the series representing $f$ and the distance of the origin to the dominant,
i.e., closest, singularities. We are going to start examining here the Second Principle,
already quoted on p. 215 and relative to the form,

\[ f_n = A^n\,\theta(n), \]

with $\theta(n)$ the subexponential factor:

Second Principle of Coefficient Asymptotics. The nature of the function’s
singularities determines the associated subexponential factor ($\theta(n)$).

In this section, we develop a complete theory in the case of rational functions (that is,
quotients of polynomials) and, more generally, meromorphic functions. The net result
is that, for such functions, the subexponential factors are essentially polynomials:

Polar singularities ⇝ Subexponential factors $\theta(n)$ are of polynomial growth.

A distinguishing feature is the extremely good quality of the asymptotic approximations
obtained; for naturally occurring combinatorial problems, 15 digits of accuracy is
not uncommon in coefficients of index as low as 50 (see Figure 7 below for a striking
example).


IV.5.1. Rational functions. A function $f(z)$ is a rational function iff it is of the
form $f(z) = N(z)/D(z)$, with $N(z)$ and $D(z)$ being polynomials, which we may
without loss of generality assume to be relatively prime. For rational functions that
are analytic at the origin (e.g., generating functions), we have $D(0) \neq 0$.

Sequences $\{f_n\}_{n \geq 0}$ that are coefficients of rational functions satisfy linear
recurrence relations with constant coefficients. This fact is easy to establish: compute
$[z^n]\, f(z) \cdot D(z)$; then, with $D(z) = d_0 + d_1 z + \cdots + d_m z^m$, one has, for
all $n > \deg(N(z))$,

\[ \sum_{j=0}^{m} d_j f_{n-j} = 0. \]
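The recurrence gives a direct way to compute coefficients of a rational function. Here is a minimal sketch, assuming polynomials are given as coefficient lists in increasing degree; the function name `rational_coefficients` is illustrative only.

```python
from fractions import Fraction


def rational_coefficients(num, den, count):
    """First `count` Taylor coefficients of num(z)/den(z), assuming den[0] != 0.

    From f(z) * D(z) = N(z):  d_0 f_n = [z^n] N(z) - sum_{j>=1} d_j f_{n-j}.
    """
    coeffs = []
    for n in range(count):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * coeffs[n - j]
        coeffs.append(acc / den[0])
    return coeffs


# The OGF of Fibonacci numbers, z / (1 - z - z^2)
fib = rational_coefficients([0, 1], [1, -1, -1], 12)
print([int(x) for x in fib])
```

For indices beyond the degree of the numerator, the term coming from $N(z)$ vanishes and the loop reduces to the homogeneous recurrence displayed above.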


The main theorem we prove here provides an exact finite expression for coeffi- 
cients of f(z) in terms of the poles of f(z). Individual terms in these expressions are 
sometimes called exponential polynomials. 
Theorem IV.9 (Expansion of rational functions). If $f(z)$ is a rational function that is
analytic at zero and has poles at points $\alpha_1, \alpha_2, \ldots, \alpha_m$, then its coefficients are a sum
of exponential polynomials: there exist $m$ polynomials $\{\Pi_j(x)\}_{j=1}^{m}$ such that, for $n$
larger than some fixed $n_0$,

\[ f_n \equiv [z^n] f(z) = \sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n}. \tag{28} \]

Furthermore the degree of $\Pi_j$ is equal to the order of the pole of $f$ at $\alpha_j$ minus one.


PROOF. Since $f(z)$ is rational it admits a partial fraction expansion. To wit:

\[ f(z) = Q(z) + \sum_{(\alpha, r)} \frac{c_{\alpha,r}}{(z-\alpha)^r}, \]

where $Q(z)$ is a polynomial of degree $n_0 := \deg(N) - \deg(D)$ if $f = N/D$. There $\alpha$
ranges over the poles of $f(z)$ and $r$ is bounded from above by the multiplicity of $\alpha$ as
a pole of $f$. Coefficient extraction in this expression results from Newton’s expansion,

\[ [z^n]\frac{1}{(z-\alpha)^r} = \frac{(-1)^r}{\alpha^r}\,[z^n]\frac{1}{(1 - z/\alpha)^r} = \frac{(-1)^r}{\alpha^r}\binom{n+r-1}{r-1}\alpha^{-n}. \]

The binomial coefficient is a polynomial of degree $r - 1$ in $n$, and collecting terms
associated with a given $\alpha$ yields the statement of the theorem.

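Notice that the expansion (28) is also an asymptotic expansion in disguise: when
grouping terms according to the $\alpha$’s of increasing modulus, each group appears to be
exponentially smaller than the previous one. In particular, if there is a unique dominant
pole, $|\alpha_1| < |\alpha_2| \leq |\alpha_3| \leq \cdots$, then

\[ f_n \sim \alpha_1^{-n}\,\Pi_1(n), \]

and the error term is exponentially small as it is $O(\alpha_2^{-n} n^r)$ for some $r$. A classical
instance is the OGF of Fibonacci numbers,

\[ F(z) = \frac{z}{1 - z - z^2}, \]

with poles at $\frac{-1+\sqrt{5}}{2} \doteq 0.61803$ and $\frac{-1-\sqrt{5}}{2} \doteq -1.61803$, so that

\[ [z^n]F(z) \equiv F_n = \frac{1}{\sqrt{5}}\varphi^n - \frac{1}{\sqrt{5}}\bar{\varphi}^n = \frac{\varphi^n}{\sqrt{5}} + O\!\left(\frac{1}{\varphi^n}\right), \]

with $\varphi = (1+\sqrt{5})/2$ the golden ratio, and $\bar{\varphi}$ its conjugate.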


▷ IV.24. A simple exercise. Let $f(z)$ be as in Theorem IV.9, assuming additionally a unique
dominant pole $\alpha_1$ of multiplicity $r$. Then, by inspection of the proof of Theorem IV.9:

\[ f_n = \frac{(-1)^r C}{(r-1)!}\,\alpha_1^{-n-r}\,n^{r-1}\left(1 + O\!\left(\frac{1}{n}\right)\right) \qquad\text{with}\quad C = \lim_{z \to \alpha_1}(z - \alpha_1)^r f(z). \]

This is certainly the most direct illustration of the Second Principle: under the assumptions, a
one-term asymptotic expansion of the function at its dominant singularity suffices to determine
the asymptotic form of the coefficients. ◁
EXAMPLE IV.4. Qualitative analysis of a rational function. This is an artificial example
designed to demonstrate that all the details of the full decomposition are usually not required.
The rational function

\[ f(z) = \frac{1}{(1 - z^3)^2 (1 - z^2)^3 \left(1 - \frac{z^2}{2}\right)} \]

has a pole of order 5 at $z = 1$, poles of order 2 at $z = \omega, \omega^2$ ($\omega = e^{2i\pi/3}$ a cubic root of unity),
a pole of order 3 at $z = -1$, and simple poles at $z = \pm\sqrt{2}$. Therefore,

\[ f_n = P_1(n) + P_2(n)\,\omega^{-n} + P_3(n)\,\omega^{-2n} + P_4(n)(-1)^n + P_5(n)\,2^{-n/2} + P_6(n)(-1)^n 2^{-n/2}, \]

where the degrees of $P_1, \ldots, P_6$ are respectively 4, 1, 1, 2, 0, 0. For an asymptotic equivalent
of $f_n$, only the poles at roots of unity need to be considered since they correspond to the fastest
exponential growth; in addition, only $z = 1$ needs to be considered for first order asymptotics;
finally, at $z = 1$, only the term of fastest growth needs to be taken into account. In this way, we
find the correspondence

\[ f(z) \sim \frac{1}{3^2 \cdot 2^3 \cdot \left(\frac{1}{2}\right)}\,\frac{1}{(1-z)^5} \quad\Longrightarrow\quad f_n \sim \frac{1}{3^2 \cdot 2^3 \cdot \left(\frac{1}{2}\right)}\binom{n+4}{4} \sim \frac{n^4}{864}. \]

The way the analysis can be developed without computing details of partial fraction expansion
is typical. END OF EXAMPLE IV.4.
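The estimate $f_n \sim n^4/864$ of Example IV.4 can be checked against exact coefficients. The sketch below is illustrative (the helper names and the index 1200 are arbitrary choices): it expands the denominator, runs the linear recurrence of Section IV.5.1, and compares with the one-term estimate. Since only the leading term is kept, the agreement improves slowly, with a relative error of order $1/n$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    c = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def poly_pow(a, k):
    """k-th power of a polynomial."""
    result = [Fraction(1)]
    for _ in range(k):
        result = poly_mul(result, a)
    return result


def inverse_coefficients(den, count):
    """First `count` Taylor coefficients of 1/den(z), assuming den[0] != 0."""
    coeffs = []
    for n in range(count):
        acc = Fraction(1 if n == 0 else 0)
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * coeffs[n - j]
        coeffs.append(acc / den[0])
    return coeffs


# D(z) = (1 - z^3)^2 (1 - z^2)^3 (1 - z^2/2)
den = poly_mul(
    poly_mul(poly_pow([1, 0, 0, -1], 2), poly_pow([1, 0, -1], 3)),
    [1, 0, Fraction(-1, 2)],
)

n = 1200
f = inverse_coefficients(den, n + 1)
ratio = float(f[n]) / (n**4 / 864)
print(ratio)  # close to 1; the deviation is of order 1/n
```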


Theorem IV.9 applies to any specification leading to a GF that is a rational
function¹⁰. Combined with the qualitative approach to rational coefficient asymptotics,
it gives access to a large number of effective asymptotic estimates for combinatorial
counting sequences.

EXAMPLE IV.5. Asymptotics of denumerants. Denumerants are integer partitions with
summands restricted to be from a fixed finite set (Chapter I, p. 41). We let $\mathcal{P}^T$ be the class relative
to set $T \subset \mathbb{Z}_{>0}$, with the known OGF,

\[ P^T(z) = \prod_{\omega \in T} \frac{1}{1 - z^\omega}. \]

A particular case is the one of integer partitions whose summands are in $\{1, 2, \ldots, r\}$,

\[ P^{\{1,\ldots,r\}}(z) = \prod_{m=1}^{r} \frac{1}{1 - z^m}. \]

The poles of this GF are all roots of unity. At $z = 1$, the order of the pole is $r$, and one has

\[ P^{\{1,\ldots,r\}}(z) \sim \frac{1}{r!}\,\frac{1}{(1-z)^r}, \]

as $z \to 1$. Other poles have smaller multiplicity: for instance the multiplicity of $z = -1$ is
equal to the number of factors $(1 - z^{2j})^{-1}$ in $P^{\{1,\ldots,r\}}$, that is $\lfloor r/2 \rfloor$; in general a primitive $q$th
root of unity is found to have multiplicity $\lfloor r/q \rfloor$. There results that $z = 1$ contributes a term of
the form $n^{r-1}$ to the coefficient of order $n$, while each of the other poles contributes a term of
order at most $n^{\lfloor r/2 \rfloor - 1}$. We thus find

\[ P_n^{\{1,\ldots,r\}} \sim c_r\, n^{r-1} \qquad\text{with}\quad c_r = \frac{1}{r!\,(r-1)!}. \]

¹⁰ In Part A, we have been occasionally led to discuss coefficients of rational functions, thereby
anticipating the statement of the theorem: see for instance the discussion of parts in compositions (p. 157) and
of records in sequences (p. 178).
The same argument provides the asymptotic form of $P_n^T$, since, to first order asymptotics,
only the pole at $z = 1$ counts. One then has:

Proposition IV.2. Let $T$ be a finite set of integers without a common divisor ($\gcd(T) = 1$).
The number of partitions with summands restricted to $T$ satisfies

\[ P_n^T \sim \frac{1}{\tau}\,\frac{n^{r-1}}{(r-1)!}, \qquad\text{with}\quad \tau := \prod_{\omega \in T}\omega, \quad r := \operatorname{card}(T). \]

For instance, in a strange country that would have pennies (1 cent), nickels (5 cents), dimes
(10 cents), and quarters (25 cents), the number of ways to make change for a total of $n$ cents is

\[ [z^n]\frac{1}{(1-z)(1-z^5)(1-z^{10})(1-z^{25})} \sim \frac{1}{1 \cdot 5 \cdot 10 \cdot 25}\,\frac{n^3}{3!} \equiv \frac{n^3}{7500}, \]

asymptotically. END OF EXAMPLE IV.5.
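The change-making estimate is easy to test by dynamic programming. The following is a minimal sketch (function name and the test value 50000 are illustrative): multiplying a GF by $1/(1 - z^c)$ amounts to the in-place update `ways[t] += ways[t - c]`.

```python
def change_counts(coins, n_max):
    """ways[n] = [z^n] prod_{c in coins} 1/(1 - z^c), for 0 <= n <= n_max."""
    ways = [1] + [0] * n_max
    for coin in coins:
        # Multiplication by 1/(1 - z^coin): cumulative sums with stride `coin`.
        for total in range(coin, n_max + 1):
            ways[total] += ways[total - coin]
    return ways


coins = (1, 5, 10, 25)
n = 50_000
ways = change_counts(coins, n)
ratio = ways[n] / (n**3 / 7500)
print(ways[100], ratio)
```

The ratio approaches 1 only at rate $O(1/n)$, since the lower-order terms of the pole at $z = 1$ and the poles at other roots of unity contribute terms of order $n^2$.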


IV.5.2. Meromorphic Functions. An expansion similar to that of Theorem IV.9
holds true for coefficients of a larger class, that of meromorphic functions.

Theorem IV.10 (Expansion of meromorphic functions). Let $f(z)$ be a function
meromorphic for $|z| \leq R$ with poles at points $\alpha_1, \alpha_2, \ldots, \alpha_m$, and analytic at all points of
$|z| = R$ and at $z = 0$. Then there exist $m$ polynomials $\{\Pi_j(x)\}_{j=1}^{m}$ such that:

\[ f_n \equiv [z^n] f(z) = \sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n} + O(R^{-n}). \tag{29} \]

Furthermore the degree of $\Pi_j$ is equal to the order of the pole of $f$ at $\alpha_j$ minus one.


PROOF. We offer two different proofs, one based on subtracted singularities, the other 
one based on contour integration. 


(i) Subtracted singularities. Around any pole $\alpha$, $f(z)$ can be expanded locally:

\[ f(z) = \sum_{k \geq -M} c_{\alpha,k}(z - \alpha)^k \tag{30} \]
\[ \phantom{f(z)} = S_\alpha(z) + H_\alpha(z) \tag{31} \]

where the “singular part” $S_\alpha(z)$ is obtained by collecting all the terms with index in
$[-M\,..\,-1]$ (that is, forming $S_\alpha(z) = N_\alpha(z)/(z - \alpha)^M$ with $N_\alpha(z)$ a polynomial
of degree less than $M$) and $H_\alpha(z)$ is analytic at $\alpha$. Thus setting $S(z) := \sum_j S_{\alpha_j}(z)$,
we observe that $f(z) - S(z)$ is analytic for $|z| \leq R$. In other words, by collecting
the singular parts of the expansions and subtracting them, we have “removed” the
singularities of $f(z)$, whence the name of method of subtracted singularities sometimes
given to the method [265, vol. 2, p. 448].
Taking coefficients, we get:

\[ [z^n] f(z) = [z^n] S(z) + [z^n]\left(f(z) - S(z)\right). \]

The coefficient of $z^n$ in the rational function $S(z)$ is obtained from Theorem IV.9. It
suffices to prove that the coefficient of $z^n$ in $f(z) - S(z)$, a function analytic for $|z| \leq R$,
is $O(R^{-n})$. This fact follows from trivial bounds applied to Cauchy’s integral
formula with the contour of integration being $\lambda = \{z : |z| = R\}$, as in the proof of
Theorem IV.7:

\[ \left|[z^n]\left(f(z) - S(z)\right)\right| = \frac{1}{2\pi}\left|\int_{|z|=R}\left(f(z) - S(z)\right)\frac{dz}{z^{n+1}}\right| \leq \frac{1}{2\pi}\,\frac{O(1)}{R^{n+1}}\,2\pi R. \]


(ii) Contour integration. There is another line of proof for Theorem IV.10 which
we briefly sketch as it provides an insight which is useful for applications to other
types of singularities treated in Chapter VI. It consists in using directly Cauchy’s
coefficient formula and “pushing” the contour of integration past singularities. In
other words, one computes directly the integral

\[ I_n = \frac{1}{2i\pi}\int_{|z|=R} f(z)\,\frac{dz}{z^{n+1}} \]

by residues. There is a pole at $z = 0$ with residue $f_n$ and poles at the $\alpha_j$ with residues
corresponding to the terms in the expansion stated in Theorem IV.10; for instance, if
$f(z) \sim c/(z - a)$ as $z \to a$, then

\[ \operatorname{Res}\left(f(z)\,z^{-n-1};\ z = a\right) = \operatorname{Res}\left(\frac{c}{z - a}\,z^{-n-1};\ z = a\right) = \frac{c}{a^{n+1}}. \]

Finally, by the same trivial bounds as before, $I_n$ is $O(R^{-n})$.

▷ IV.25. Effective error bounds. The error term $O(R^{-n})$ in (29), call it $\varepsilon_n$, satisfies

\[ |\varepsilon_n| \leq R^{-n} \cdot \sup_{|z|=R}|f(z)|. \]

This results immediately from the second proof. This bound may be useful, even in the case of
rational functions. ◁


EXAMPLE IV.6. Surjections. These are defined as sequences of sets ($\mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\geq 1}(\mathcal{Z}))$)
with EGF $R(z) = (2 - e^z)^{-1}$ (see p. 98). We have already determined the poles, the one
of smallest modulus being at $\log 2 \doteq 0.69314$. At this dominant pole, one finds
$R(z) \sim -\frac{1}{2}(z - \log 2)^{-1}$. This implies an approximation for the number of surjections:

\[ R_n \equiv n!\,[z^n]R(z) \sim \xi(n), \qquad\text{with}\quad \xi(n) := \frac{n!}{2}\cdot\left(\frac{1}{\log 2}\right)^{n+1}. \]

Figure 7 displays, for $n = 2, 4, \ldots, 32$, a table of the values of the surjection numbers (left) compared
with the asymptotic approximation rounded¹¹ to the nearest integer, $\lceil\xi(n)\rfloor$. It is piquant to
see that $\lceil\xi(n)\rfloor$ provides the exact value of $R_n$ for all values of $n = 1, \ldots, 15$, and it starts
losing one digit for $n = 17$, after which point a few “wrong” digits gradually appear, but in
very limited number; see Figure 7. (A similar situation holds for tangent numbers discussed
in our Invitation, p. 4.) The explanation of such a faithful asymptotic representation owes to
the fact that the error terms provided by meromorphic asymptotics are exponentially small. In
effect, there is no other pole in $|z| \leq 6$, the next ones being at $\log 2 \pm 2i\pi$ with modulus of
about 6.32. Thus, for $r_n = [z^n]R(z)$, there holds

\[ \frac{R_n}{n!} \equiv r_n = \frac{1}{2}\cdot\left(\frac{1}{\log 2}\right)^{n+1} + O(6^{-n}). \tag{32} \]

For the double surjection problem, $R^*(z) = (2 + z - e^z)^{-1}$, we get similarly

\[ [z^n]R^*(z) \sim \frac{1}{e^{\rho^*} - 1}\,(\rho^*)^{-n-1}, \]

with $\rho^* \doteq 1.14619$ the smallest positive root of $e^{\rho^*} - \rho^* = 2$. END OF EXAMPLE IV.6.

   n | R_n (exact)                               | ⌈ξ(n)⌋ (approximation)
   2 | 3                                         | 3
   4 | 75                                        | 75
   6 | 4683                                      | 4683
   8 | 545835                                    | 545835
  10 | 102247563                                 | 102247563
  12 | 28091567595                               | 28091567595
  14 | 10641342970443                            | 10641342970443
  16 | 5315654681981355                          | 5315654681981355
  18 | 3385534663256845323                       | 338553466325684532 6
  20 | 2677687796244384203115                    | 2677687796244384203 088
  22 | 2574844419803190384544203                 | 2574844419803190384544 450
  24 | 2958279121074145472650648875              | 295827912107414547265064 6597
  26 | 4002225759844168492486127539083           | 40022257598441684924861275 55859
  28 | 6297562064950066033518373935334635        | 6297562064950066033518373935 416161
  30 | 11403568794011880483742464196184901963    | 1140356879401188048374246419617 4527074
  32 | 23545154085734896649184490637144855476395 | 2354515408573489664918449063714 5314147690

FIGURE IV.7. The surjection numbers pyramid: for $n = 2, 4, \ldots, 32$, the exact values
of the numbers $R_n$ (left) compared to the approximation $\lceil\xi(n)\rfloor$ with the discrepant trailing
digits set off by a space (right).

¹¹ The notation $\lceil x\rfloor$ represents $x$ rounded to the nearest integer: $\lceil x\rfloor := \lfloor x + \frac{1}{2}\rfloor$.
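The comparison between surjection numbers and the rounded approximation can be replayed with exact integers. This is a sketch under stated assumptions: the recurrence $R_n = \sum_{k \geq 1}\binom{n}{k}R_{n-k}$ (a direct consequence of $R(z)(2 - e^z) = 1$) is used for the exact values, and Python's `decimal` module with 80 significant digits, an arbitrary but ample precision, is used for $\log 2$.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR
from math import comb, factorial

getcontext().prec = 80  # ample precision for n up to 32


def surjection_numbers(n_max):
    """R_0..R_{n_max} via R_n = sum_{k=1}^{n} C(n, k) R_{n-k}, with R_0 = 1."""
    R = [1]
    for n in range(1, n_max + 1):
        R.append(sum(comb(n, k) * R[n - k] for k in range(1, n + 1)))
    return R


def xi_rounded(n):
    """Nearest integer to n! / (2 (log 2)^(n+1)), computed as floor(x + 1/2)."""
    x = Decimal(factorial(n)) / (2 * Decimal(2).ln() ** (n + 1))
    return int((x + Decimal("0.5")).to_integral_value(rounding=ROUND_FLOOR))


exact = surjection_numbers(32)
approx = [xi_rounded(n) for n in range(33)]

for n in range(2, 33, 2):
    print(n, exact[n], approx[n], approx[n] - exact[n])
```

The last column shows the discrepancy growing roughly like $n!/6.32^{\,n+1}$, in line with the position of the next poles at $\log 2 \pm 2i\pi$.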


It is worth reflecting on this example as it is representative of a production chain 
based on the two successive implications reflecting the spirit of Part A and Part B of 
the book: 

\[ \mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\geq 1}(\mathcal{Z})) \quad\Longrightarrow\quad R(z) = \frac{1}{2 - e^z} \]
\[ R(z) \mathop{\sim}_{z \to \log 2} -\frac{1}{2}\,\frac{1}{(z - \log 2)} \quad\leadsto\quad \frac{1}{n!}R_n \sim \frac{1}{2}(\log 2)^{-n-1}. \]

There the first implication (written ‘⟹’ as usual) is provided automatically by the
symbolic method. The second one (written here ‘⇝’) is a formal translation from the
expansion of the GF at its dominant singularity to the asymptotic form of coefficients,
validity being granted by complex-analytic conditions.


EXAMPLE IV.7. Alignments. These are sequences of cycles ($\mathcal{O} = \mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z}))$, p. 110)
with EGF

\[ O(z) = \frac{1}{1 - \log(1-z)^{-1}}. \]

There is a singularity when $\log(1-z)^{-1} = 1$, which is at $\rho = 1 - e^{-1}$ and arises before
$z = 1$ where the logarithm becomes singular. Then, the computation of the asymptotic form of
$[z^n]O(z)$ only requires a local expansion near $\rho$,

\[ O(z) \sim \frac{-e^{-1}}{z - 1 + e^{-1}} \quad\Longrightarrow\quad [z^n]O(z) \sim \frac{e^{-1}}{(1 - e^{-1})^{n+1}}, \]

and the coefficient estimates result from Theorem IV.10. END OF EXAMPLE IV.7.

▷ IV.26. Some “supernecklaces”. One estimates

\[ [z^n]\log\left(\frac{1}{1 - \log\frac{1}{1-z}}\right) \sim \frac{1}{n}\left(1 - e^{-1}\right)^{-n}, \]

where the EGF enumerates labelled cycles of cycles (supernecklaces, p. 115). [Hint: Take
derivatives.] ◁


EXAMPLE IV.8. Generalized derangements. The probability that the shortest cycle in a
random permutation of size $n$ has length larger than $k$ is

\[ [z^n]D^{(k)}(z), \qquad\text{where}\quad D^{(k)}(z) = \frac{1}{1-z}\,e^{-\frac{z}{1} - \frac{z^2}{2} - \cdots - \frac{z^k}{k}}, \]

as results from the specification $\mathcal{D}^{(k)} = \mathrm{SET}(\mathrm{CYC}_{>k}(\mathcal{Z}))$. For any fixed $k$, one has (easily)
$D^{(k)}(z) \sim e^{-H_k}/(1-z)$ as $z \to 1$, with 1 being a simple pole. Accordingly the coefficients
$[z^n]D^{(k)}(z)$ tend to $e^{-H_k}$ as $n \to \infty$. Thus, due to meromorphy, we have the characteristic
implication

\[ D^{(k)}(z) \sim \frac{e^{-H_k}}{1-z} \quad\Longrightarrow\quad [z^n]D^{(k)}(z) \sim e^{-H_k}. \]

Since there is no other singularity at a finite distance, the error in the approximation is (at least)
exponentially small,

\[ [z^n]\frac{e^{-\frac{z}{1} - \frac{z^2}{2} - \cdots - \frac{z^k}{k}}}{1-z} = e^{-H_k} + O(R^{-n}), \tag{33} \]

for any $R > 1$. The cases $k = 1, 2$ in particular justify the estimates mentioned at the beginning
of this chapter, on p. 216. END OF EXAMPLE IV.8.
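The probabilities in (33) can be computed exactly. The following sketch is illustrative (the function name is hypothetical): with $E(z) = \exp(P(z))$ and $P(z) = -\sum_{j \leq k} z^j/j$, the relation $E' = P'E$ gives a $k$-term recurrence for the coefficients of $E$, and division by $1 - z$ amounts to taking partial sums.

```python
from fractions import Fraction
from math import exp


def prob_shortest_cycle_exceeds(n, k):
    """[z^n] exp(-z - z^2/2 - ... - z^k/k) / (1 - z), as an exact fraction."""
    # e[m] = [z^m] exp(P(z)); from E' = P' E:  m e[m] = -(e[m-1] + ... + e[m-k]).
    e = [Fraction(1)]
    for m in range(1, n + 1):
        e.append(-sum(e[m - j] for j in range(1, min(k, m) + 1)) / m)
    return sum(e)  # partial sum = coefficient after dividing by (1 - z)


harmonic_2 = 1 + 1 / 2
print(prob_shortest_cycle_exceeds(3, 1))   # derangements of size 3
print(float(prob_shortest_cycle_exceeds(30, 2)), exp(-harmonic_2))
```

For $k = 1$ this reproduces the classical derangement probabilities, and already at $n = 30$ the value for $k = 2$ agrees with $e^{-3/2}$ to many digits, as the error term in (33) predicts.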


This example is also worth reflecting upon. In prohibiting cycles of length $\leq k$,
we modify the EGF of all permutations, $(1-z)^{-1}$, by a factor $e^{-z/1 - \cdots - z^k/k}$. The
resulting EGF is meromorphic at 1; thus only the value of the modifying factor at
$z = 1$ matters, so that this value, namely $e^{-H_k}$, provides the asymptotic proportion
of $k$-derangements. We shall encounter more and more shortcuts of this sort as we
progress into the book.


▷ IV.27. Shortest cycles of permutations are not too long. Let $S_n$ be the random variable
denoting the length of the shortest cycle in a random permutation of size $n$. Using the circle
$|z| = 2$ to estimate the error in the approximation $e^{-H_k}$ above, one finds that, for $k \leq \log n$,

\[ \left|\mathbb{P}(S_n > k) - e^{-H_k}\right| \leq \frac{1}{2^n}\,e^{2^{k+1}}, \]

which is exponentially small in this range of $k$-values. Thus, the approximation $e^{-H_k}$ remains
good when $k$ is allowed to tend sufficiently slowly to $\infty$ with $n$. One can also explore the
possibility of better bounds and larger regions of validity of the main approximation. (See
Panario and Richmond’s study [385] for a general theory of smallest components in sets.) ◁


▷ IV.28. Expected length of the shortest cycle. The classical approximation of the harmonic
numbers, $H_k \approx \log k + \gamma$, suggests $e^{-\gamma}/k$ as a possible approximation to (33) for both large $n$
and large $k$ in suitable regions. In agreement with this heuristic argument, the expected length
of the shortest cycle in a random permutation of size $n$ is effectively asymptotic to

\[ \sum_{k=1}^{n}\frac{e^{-\gamma}}{k} \sim e^{-\gamma}\log n, \]

a property first discovered by Shepp and Lloyd [436]. ◁


The next example illustrates the analysis of a collection of rational generating 
functions (Smirnov words) paralleling nicely the enumeration of a special type of 
integer composition (Carlitz compositions) that resorts to meromorphic asymptotics. 
EXAMPLE IV.9. Smirnov words and Carlitz compositions. Bernoulli trials have been
discussed in Chapter III, in relation to weighted word models. Take the class $\mathcal{W}$ of all words over
an $r$-ary alphabet, where letter $j$ is assigned probability $p_j$ and letters of words are drawn
independently. With this weighting, the GF of all words is $W(z) = 1/(1 - \sum p_j z) = (1-z)^{-1}$.
Consider the problem of determining the probability that a random word of length $n$ is of
Smirnov type, that is, all blocks of length 2 are formed with distinct letters. In order to avoid
degeneracies, we impose $r \geq 3$ (since for $r = 2$, the only Smirnov words are $ababa\ldots$ and
$babab\ldots$).

By our discussion of Section III.7 (p. 193), the GF of Smirnov words (again with the
probabilistic weighting) is

\[ S(z) = \frac{1}{1 - \sum_{j}\frac{p_j z}{1 + p_j z}}. \]

By monotonicity of the denominator, this rational function has a unique dominant singularity
at $\rho$ such that

\[ \sum_{j=1}^{r}\frac{p_j\rho}{1 + p_j\rho} = 1, \tag{34} \]


and $z = \rho$ is a simple pole. Consequently, $\rho$ is a well-characterized algebraic number defined
implicitly by an equation of degree $r$. There results that the probability for a word to be Smirnov
is (not too surprisingly) exponentially small, with the precise formula being

\[ [z^n]S(z) \sim C \cdot \rho^{-n}, \qquad C = \left(\rho\sum_{j=1}^{r}\frac{p_j}{(1 + p_j\rho)^2}\right)^{-1}. \]

A similar analysis, using bivariate generating functions, shows that in a random word of length $n$
conditioned to be Smirnov, the letter $j$ appears with asymptotic frequency

\[ q_j = \frac{1}{Q}\,\frac{p_j}{(1 + p_j\rho)^2}, \qquad Q := \sum_{j=1}^{r}\frac{p_j}{(1 + p_j\rho)^2}, \tag{35} \]

in the sense that the mean number of occurrences of letter $j$ is asymptotic to $q_j n$. All these
results are seen to be consistent with the equiprobable letter case $p_j = 1/r$, for which
$\rho = r/(r-1)$.

Carlitz compositions illustrate a limit situation, in which the alphabet is infinite, while
letters have different sizes. Recall that a Carlitz composition of the integer $n$ is a composition
of $n$ such that no two adjacent summands have equal value. By Note III.31, p. 190, such
compositions can be obtained by substitution from Smirnov words, to the effect that

\[ K(z) = \left(1 - \sum_{j=1}^{\infty}\frac{z^j}{1 + z^j}\right)^{-1}. \tag{36} \]

The asymptotic form of the coefficients then results from an analysis of dominant poles. The
OGF has a simple pole at $\rho$, which is the smallest positive root of the equation

\[ \sum_{j=1}^{\infty}\frac{\rho^j}{1 + \rho^j} = 1. \tag{37} \]

(Note the analogy with (34) due to commonality of the combinatorial argument.) Thus:

\[ K_n \sim C \cdot \beta^n, \qquad C \doteq 0.45636\,34740, \quad \beta \doteq 1.75024\,12917. \]
There, $\beta = 1/\rho$ with $\rho$ as in (37). In a way analogous to Smirnov words, the asymptotic
frequency of summand $k$ appears to be proportional to $k\rho^k/(1 + \rho^k)^2$; see [296, 344] for
further properties. END OF EXAMPLE IV.9.

FIGURE IV.8. The coefficients $[z^n]f(z)$, where $f(z) = (1 + 1.02\,z^4)^{-3}(1 - 1.05\,z^5)^{-1}$,
illustrate a periodic superposition of smooth behaviours
that depend on the residue class of $n$ modulo 20.
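The constants of the Carlitz estimate can be recovered numerically from (37). The sketch below rests on a few assumptions: the infinite sums are truncated at 400 terms (harmless for $\rho < 0.9$), the root is located by bisection, and the constant is taken as $C = 1/(\rho\,g'(\rho))$ with $g(z) = \sum_j z^j/(1 + z^j)$, which is the residue computation for a simple pole of $1/(1 - g(z))$.

```python
TERMS = 400  # truncation order of the infinite sums (illustrative choice)


def g(x):
    """g(x) = sum_{j>=1} x^j / (1 + x^j), truncated."""
    return sum(x**j / (1 + x**j) for j in range(1, TERMS + 1))


def x_times_g_prime(x):
    """x g'(x) = sum_{j>=1} j x^j / (1 + x^j)^2, truncated."""
    return sum(j * x**j / (1 + x**j) ** 2 for j in range(1, TERMS + 1))


# Bisection for g(rho) = 1; g is increasing on (0, 1).
lo, hi = 0.1, 0.9
for _ in range(200):
    mid = (lo + hi) / 2
    if g(mid) < 1:
        lo = mid
    else:
        hi = mid
rho = (lo + hi) / 2

beta = 1 / rho
C = 1 / x_times_g_prime(rho)
print(beta, C)
```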


IV.6. Localization of singularities


There are situations where a function possesses several dominant singularities, 
that is, several singularities are present on the boundary of the disc of convergence. 
We examine here the induced effect on coefficients and discuss ways to localize such 
dominant singularities. 


IV.6.1. Multiple singularities. In the case when there exists more than one
dominant singularity, several geometric terms of the form $\beta^n$ sharing the same modulus
(and each carrying its own subexponential factor) must be combined. In simpler
situations, such terms globally induce a pure periodic behaviour for coefficients that is
easy to describe. In the general case, irregular fluctuations of a somewhat arithmetic
nature may prevail.

Pure periodicities. When several dominant singularities of $f(z)$ have the same
modulus and are regularly spaced on the boundary of the disc of convergence, they
may induce complete cancellations of the main exponential terms in the asymptotic
expansion of the coefficient $f_n$. In that case, different regimes will be present in the
coefficients $f_n$ based on congruence properties of $n$. For instance, the functions

\[ \frac{1}{1 + z^2} = 1 - z^2 + z^4 - z^6 + z^8 - \cdots, \qquad \frac{1}{1 - z^3} = 1 + z^3 + z^6 + z^9 + \cdots, \]

exhibit patterns of periods 4 and 3 respectively, this corresponding to poles that are
roots of unity of order 4 ($\pm i$), and 3 ($\omega : \omega^3 = 1$). Accordingly, the function

\[ \phi(z) = \frac{1}{1 + z^2} + \frac{1}{1 - z^3} = \frac{2 - z^2 + z^3 + z^4 + z^8 + z^9 - z^{10}}{1 - z^{12}} \]

has coefficients that obey a pattern of period 12 (for example, the coefficients $\phi_n$ such
that $n \equiv 1, 5, 6, 7, 11$ modulo 12 are zero). Accordingly, the coefficients of

\[ [z^n]\psi(z) \qquad\text{where}\quad \psi(z) = \phi(z) + \frac{1}{1 - z/2}, \]

manifest a different exponential growth when $n$ is congruent to 1, 5, 6, 7, 11 mod 12.
See Figure 8 for such a superposition of pure periodicities. In many combinatorial
applications, generating functions involving periodicities can be decomposed at sight,
and the corresponding asymptotic subproblems generated are then solved separately.

FIGURE IV.9. The coefficients of $f = 1/(1 - \frac{6}{5}z + z^2)$ exhibit an apparently chaotic
behaviour (left) which in fact corresponds to a discrete sampling of a sine function (right),
reflecting the presence of two conjugate complex poles.
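The period-12 pattern of $\phi(z) = 1/(1+z^2) + 1/(1-z^3)$ can be confirmed directly from the two geometric expansions. A minimal sketch (the function name is illustrative):

```python
def phi_coefficient(n):
    """[z^n] of 1/(1 + z^2) + 1/(1 - z^3)."""
    from_first = (-1) ** (n // 2) if n % 2 == 0 else 0   # 1 - z^2 + z^4 - ...
    from_second = 1 if n % 3 == 0 else 0                 # 1 + z^3 + z^6 + ...
    return from_first + from_second


one_period = [phi_coefficient(n) for n in range(12)]
vanishing_residues = [n for n in range(12) if one_period[n] == 0]
is_periodic = all(phi_coefficient(n) == phi_coefficient(n + 12) for n in range(120))
print(one_period, vanishing_residues, is_periodic)
```

The list `one_period` reads off the numerator $2 - z^2 + z^3 + z^4 + z^8 + z^9 - z^{10}$ of $\phi(z)$ over the denominator $1 - z^{12}$.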


▷ IV.29. Decidability of polynomial properties. Given a polynomial $p(z) \in \mathbb{Q}[z]$, the following
properties are decidable: (i) whether one of the zeros of $p$ is a root of unity; (ii) whether one
of the zeros of $p$ has an argument that is commensurate with $\pi$. [One can use resultants. An
algorithmic discussion of this and related issues is given in [247].] ◁


Nonperiodic fluctuations. As a representative example, consider the polynomial
$D(z) = 1 - \frac{6}{5}z + z^2$, whose roots are

\[ \alpha = \frac{3}{5} + \frac{4}{5}i, \qquad \bar{\alpha} = \frac{3}{5} - \frac{4}{5}i, \]

both of modulus 1 (the numbers 3, 4, 5 form a Pythagorean triple), with argument $\pm\theta_0$
where $\theta_0 = \arctan(\frac{4}{3}) \doteq 0.92729$. The expansion of the function $f(z) = 1/D(z)$
starts as

\[ \frac{1}{1 - \frac{6}{5}z + z^2} = 1 + \frac{6}{5}z + \frac{11}{25}z^2 - \frac{84}{125}z^3 - \frac{779}{625}z^4 - \frac{2574}{3125}z^5 + \cdots, \]

the sign sequence being

\[ +\,+\,+\,-\,-\,-\,+\,+\,+\,+\,-\,-\,-\,+\,+\,+\,-\,-\,-\,-\,+\,+\,+\,-\,-\,-\,-\,+\,+\,+\,-\,-\,-\ \cdots, \]

which indicates a somewhat irregular oscillating behaviour, where blocks of 3 or 4
pluses follow blocks of 3 or 4 minuses.
The exact form of the coefficients of $f$ results from a partial fraction expansion:

\[ f(z) = \frac{a}{1 - z/\alpha} + \frac{b}{1 - z/\bar{\alpha}} \qquad\text{with}\quad a = \frac{1}{2} + \frac{3}{8}i, \quad b = \frac{1}{2} - \frac{3}{8}i, \]

where $\alpha = e^{i\theta_0}$, $\bar{\alpha} = e^{-i\theta_0}$. Accordingly,

\[ f_n = a\,e^{-in\theta_0} + b\,e^{in\theta_0} = \frac{\sin((n+1)\theta_0)}{\sin(\theta_0)}. \tag{38} \]

This explains the sign changes observed. Since the angle $\theta_0$ is not commensurate with
$\pi$, the coefficients fluctuate but, unlike in our earlier examples, no exact periodicity is
present in the sign patterns. See Figure 9 for a rendering and Figure 13 of Chapter V
(p. 319) for a meromorphic case linked to compositions into prime summands.

Complicated problems of an arithmetical nature may occur if several such
singularities with non-commensurate arguments combine, and some open problems even
remain in the analysis of linear recurring sequences. (For instance no decision
procedure is known to determine whether such a sequence ever vanishes.) Fortunately, such
problems occur infrequently in combinatorial applications, where dominant poles of
rational functions (as well as many other functions) tend to have a simple geometry as
we explain next.
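Formula (38) and the sign pattern can be checked against the recurrence $f_n = \frac{6}{5}f_{n-1} - f_{n-2}$ implied by $D(z)$. The following sketch computes the coefficients exactly and compares them with the sine form in floating point; the number of terms (33) is an arbitrary choice.

```python
from fractions import Fraction
from math import atan2, sin


def coefficients(count):
    """Coefficients of 1/(1 - (6/5) z + z^2) via f_n = (6/5) f_{n-1} - f_{n-2}."""
    f = [Fraction(1), Fraction(6, 5)]
    while len(f) < count:
        f.append(Fraction(6, 5) * f[-1] - f[-2])
    return f[:count]


theta0 = atan2(4, 3)          # argument of alpha = (3 + 4i)/5
f = coefficients(33)
signs = "".join("+" if x > 0 else "-" for x in f)
max_gap = max(
    abs(float(x) - sin((n + 1) * theta0) / sin(theta0)) for n, x in enumerate(f)
)
print(signs)
print(max_gap)
```

The blocks of equal signs have length 3 or 4, with no exact period, which is the irregular behaviour described in the text.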


▷ IV.30. Irregular fluctuations and Pythagorean triples. The quantity $\frac{1}{\pi}\theta_0$ is an irrational
number, so that the sign fluctuations of (38) are “irregular” (i.e., non purely periodic). [Proof:
a contrario. Indeed, otherwise, $\alpha = (3 + 4i)/5$ would be a root of unity. But then the minimal
polynomial of $\alpha$ would be a cyclotomic polynomial with nonintegral coefficients, a
contradiction; see [327, VIII.3] for the latter property.] ◁


▷ IV.31. Skolem–Mahler–Lech Theorem. Let $f_n$ be the sequence of coefficients of a rational
function, $f(z) = A(z)/B(z)$, where $A, B \in \mathbb{Q}[z]$. The set of all $n$ such that $f_n = 0$ is
the union of a finite (possibly empty) set and a finite number (possibly zero) of infinite
arithmetic progressions. (The proof is based on $p$-adic analysis, but the argument is intrinsically
nonconstructive; see [371] for an attractive introduction to the subject and references.) ◁


Periodicity conditions for positive generating functions. By the previous
discussion, it is of interest to locate dominant singularities of combinatorial generating
functions, and, in particular, determine whether their arguments (the “dominant
directions”) are commensurate to $2\pi$. In the latter case, different asymptotic regimes of the
coefficients manifest themselves, depending on congruence properties of $n$.

First a few definitions. For a sequence $(f_n)$ with GF $f(z)$, the support of $f$,
denoted $\operatorname{Supp}(f)$, is the set of all $n$ such that $f_n \neq 0$. The sequence (also its GF) is
said to admit span, or period, $d$ if for some $r$, there holds

\[ \operatorname{Supp}(f) \subseteq r + d\,\mathbb{Z}_{\geq 0} \equiv \{r,\ r + d,\ r + 2d,\ \ldots\}. \]

In that case, if $f$ is analytic at 0, then there exists a function $g$ analytic at 0 such that
$f(z) = z^r g(z^d)$. The largest span, $p$, is often plainly referred to as the period, all
other spans being divisors of $p$. With $E := \operatorname{Supp}(f)$, this maximal span is attainable
as $p = \gcd(E - E)$ (pairwise differences) as well as $p = \gcd(E - \{r\})$ where
$r := \min(E)$. For instance $\sin(z)$ has period 2, $\cos(z) + \cosh(z)$ has period 4, $z^3 e^{z^5}$
has period 5, and so on.
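Both characterizations of the maximal span are one-liners on a finite sample of the support. The sketch below is an illustration only: the supports are truncated by hand (indices below 40), and the function names are hypothetical.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def span_from_minimum(support):
    """p = gcd(E - {r}) with r = min(E)."""
    r = min(support)
    return reduce(gcd, (n - r for n in support), 0)


def span_from_differences(support):
    """p = gcd(E - E), the gcd of all pairwise differences."""
    return reduce(gcd, (b - a for a, b in combinations(sorted(support), 2)), 0)


sin_support = [n for n in range(40) if n % 2 == 1]        # sin(z)
cos_cosh_support = [n for n in range(40) if n % 4 == 0]   # cos(z) + cosh(z)
z3_exp_z5_support = [3 + 5 * k for k in range(8)]         # z^3 exp(z^5)

for support in (sin_support, cos_cosh_support, z3_exp_z5_support):
    print(span_from_minimum(support), span_from_differences(support))
```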
FIGURE IV.10. Illustration of the “Daffodil Lemma”: the images of circles $z = Re^{i\theta}$
($R = 0.4\,..\,0.8$) rendered by a polar plot of $|f(z)|$ in the case of $f(z) = z^7 e^{z^{25}} + z^2/(1 - z^{10})$,
which has span 5.


In the context of periodicities, a basic property is expressed by what we have 
chosen to name figuratively the “Daffodil Lemma”. By virtue of this lemma, the span 
of a function f with nonnegative coefficients is related to the behaviour of | f(z)| as z 
varies along circles centred at the origin (Figure 10). 


Lemma IV.1 (“Daffodil Lemma”). Let $f(z)$ be analytic in $|z| < \rho$ and have nonnegative
coefficients at 0. Assume that $f$ does not reduce to a monomial and that for some
nonzero nonpositive $z$ satisfying $|z| < \rho$, one has

\[ |f(z)| = f(|z|). \]

Then, the following hold: (i) the argument of $z$ must be commensurate to $2\pi$, i.e.,
$z = Re^{i\theta}$ with $\theta/(2\pi) = \frac{r}{p} \in \mathbb{Q}$ (an irreducible fraction) and $0 < r < p$; (ii) $f$
admits $p$ as a span.


PROOF. This classical lemma is a simple consequence of the strong triangle inequality.
Indeed, with $z = Re^{i\theta}$, the equality $|f(z)| = f(|z|)$ implies that the complex numbers
$f_n R^n e^{in\theta}$ for $n \in \operatorname{Supp}(f)$ all lie on the same ray (a half-line emanating from 0).
This is impossible if $\theta/(2\pi)$ is irrational, as soon as the expansion of $f$ contains at least two
monomials.
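The lemma can be observed numerically on a function of span 5. The sketch below assumes the test function $f(z) = z^7 e^{z^{25}} + z^2/(1 - z^{10})$ (the one of the Figure 10 caption, as far as it can be read) and an arbitrary radius $R = 0.6$: the equality $|f(z)| = f(|z|)$ holds along the five directions $2\pi k/5$ and fails in a generic direction.

```python
import cmath
from math import pi


def f(z):
    """A function with nonnegative coefficients and span 5 (support in 2 + 5Z)."""
    return z**7 * cmath.exp(z**25) + z**2 / (1 - z**10)


R = 0.6                                   # any radius below 1 would do
f_R = f(R).real                           # f(|z|), a positive real number

# Along the directions 2*pi*k/5 all monomials are aligned on one ray.
aligned_gaps = [f_R - abs(f(R * cmath.exp(2j * pi * k / 5))) for k in range(5)]

# In a generic direction (here theta = 1 radian) the modulus is strictly smaller.
generic_gap = f_R - abs(f(R * cmath.exp(1j)))
print(aligned_gaps, generic_gap)
```

Plotting $|f(Re^{i\theta})|$ against $\theta$ would show five equal petals, whence the name of the lemma.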


Berstel [43] first realized that rational generating functions arising from regular
languages can only have dominant singularities of the form $\rho\omega$, where $\omega$ is a certain
root of unity. This property in fact extends to many nonrecursive specifications, as
shown by Flajolet, Salvy, and Zimmermann in [206].


Proposition IV.3 (Commensurability of dominant directions). Let $\mathcal{S}$ be a constructible
labelled class that is nonrecursive, in the sense of Theorem IV.8. Assume that the
EGF $S(z)$ has a finite radius of convergence $\rho$. Then there exists a computable
integer $d \ge 1$ such that the set of dominant singularities of $S(z)$ is contained in the set
$\{\rho\omega\}$, where $\omega^d = 1$.

PROOF. (Sketch; see [43, 206]) By definition, a nonrecursive class $\mathcal{S}$ is obtained
from $1$ and $\mathcal{Z}$ by means of a finite number of union, product, sequence, set, and
cycle constructions. We have seen earlier, in Section IV.4, an inductive algorithm that
determines radii of convergence. It is then easy to enrich that algorithm and determine 
simultaneously (by induction on the specification) the period of its GF and the set of 
dominant directions. 

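The period is determined by simple rules. For instance, if $\mathcal{S} = \mathcal{T} \star \mathcal{U}$ ($S = T \cdot U$)
and $T$, $U$ are infinite series with respective periods $p$, $q$, one has the implication

$$\operatorname{Supp}(T) \subseteq a + p\mathbb{Z},\quad \operatorname{Supp}(U) \subseteq b + q\mathbb{Z} \;\Longrightarrow\; \operatorname{Supp}(S) \subseteq a + b + \xi\mathbb{Z},$$

with $\xi = \gcd(p, q)$. Similarly, for $\mathcal{S} = \mathrm{SEQ}(\mathcal{T})$,

$$\operatorname{Supp}(T) \subseteq a + p\mathbb{Z} \;\Longrightarrow\; \operatorname{Supp}(S) \subseteq \delta\mathbb{Z},$$

where now $\delta = \gcd(a, p)$.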

Regarding dominant singularities, the case of a sequence construction is typical.
It corresponds to $g(z) = (1 - f(z))^{-1}$. Assume that $f(z) = z^a h(z^p)$, with $p$ the
maximal period, and let $\rho > 0$ be such that $f(\rho) = 1$. The equations determining
any dominant singularity $\zeta$ are $f(\zeta) = 1$, $|\zeta| = \rho$. In particular, the equations imply
$|f(\zeta)| = f(|\zeta|)$, so that, by the Daffodil Lemma, the argument of $\zeta$ must be of the
form $2\pi r/s$. An easy refinement of the argument shows that, for $\delta = \gcd(a, p)$, all the
dominant directions coincide with the multiples of $2\pi/\delta$. The discussion of cycles is
entirely similar since $\log(1 - f)^{-1}$ has the same dominant singularities as $(1 - f)^{-1}$.
Finally, for exponentials, it suffices to observe that $e^f$ does not modify the singularity
pattern of $f$, since $\exp(z)$ is an entire function.


▷ IV.32. Daffodil lemma and unlabelled classes. Proposition IV.3 applies to any unlabelled
class $\mathcal{S}$ that admits a nonrecursive specification, provided its radius of convergence $\rho$ satisfies
$\rho < 1$. (When $\rho = 1$, there is a possibility of having the unit circle as a natural boundary, a
property that is otherwise decidable.) ◁


Exact formulae. The error terms appearing in the asymptotic expansion of coefficients
of meromorphic functions are already exponentially small. By peeling off the
singularities of a meromorphic function layer by layer, in order of increasing modulus,
one is led to extremely precise (or even exact) expansions for the coefficients. Such
exact representations are found for Bernoulli numbers $B_n$, surjection numbers $R_n$, as
well as Secant numbers $E_{2n}$ and Tangent numbers $E_{2n+1}$, defined by

$$\begin{aligned}
\sum_{n=0}^{\infty} B_n \frac{z^n}{n!} &= \frac{z}{e^z - 1} && \text{(Bernoulli numbers)}\\
\sum_{n=0}^{\infty} R_n \frac{z^n}{n!} &= \frac{1}{2 - e^z} && \text{(Surjection numbers)}\\
\sum_{n=0}^{\infty} E_{2n} \frac{z^{2n}}{(2n)!} &= \frac{1}{\cos(z)} && \text{(Secant numbers)}\\
\sum_{n=0}^{\infty} E_{2n+1} \frac{z^{2n+1}}{(2n+1)!} &= \tan(z) && \text{(Tangent numbers).}
\end{aligned}$$

Bernoulli numbers. These numbers, traditionally written $B_n$, can be defined by their
EGF $B(z) = z/(e^z - 1)$. The function $B(z)$ has poles at the points $\chi_k = 2ik\pi$, with
$k \in \mathbb{Z} \setminus \{0\}$, and the residue at $\chi_k$ is equal to $\chi_k$,

$$\frac{z}{e^z - 1} \sim \frac{\chi_k}{z - \chi_k} \qquad (z \to \chi_k).$$


The expansion theorem for meromorphic functions is applicable here: start with the 
Cauchy integral formula, and proceed as in the proof of Theorem IV.10, using as 
external contours a large circle of radius $R$ that passes half way between poles. As $R$
tends to infinity, the integrand tends to 0 (as soon as $n \ge 2$) because the Cauchy kernel
$z^{-n-1}$ decreases as an inverse power of $R$ while the EGF remains $O(R)$. In the limit,
corresponding to an infinitely large contour, the coefficient integral becomes equal to 
the sum of all residues of the meromorphic function over the whole of the complex 
plane. 

From this argument, we get the representation $B_n = -n! \sum_{k \in \mathbb{Z}\setminus\{0\}} \chi_k^{-n}$. This
verifies that $B_n = 0$ if $n$ is odd and $n \ge 3$. If $n$ is even, then grouping terms two by
two, we get the exact representation (which also serves as an asymptotic expansion):

(39)    $\displaystyle \frac{B_{2n}}{(2n)!} = (-1)^{n-1}\, 2^{1-2n}\, \pi^{-2n} \sum_{k=1}^{\infty} \frac{1}{k^{2n}}.$

Reverting the equality, we have also established that

$$\zeta(2n) = (-1)^{n-1}\, 2^{2n-1}\, \pi^{2n}\, \frac{B_{2n}}{(2n)!} \quad\text{with}\quad \zeta(s) = \sum_{k=1}^{\infty} \frac{1}{k^s},\quad B_n = n!\,[z^n]\,\frac{z}{e^z - 1},$$

a well-known identity that provides values of the Riemann zeta function $\zeta(s)$ at even
integers as rational multiples of powers of $\pi$.
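
Relation (39) can be confronted with exact values. In the sketch below, `bernoulli` computes the $B_n$ as exact fractions from the classical recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ (a standard fact that we bring in, not the method of the text), and `rhs_39` evaluates the right-hand side of (39) with the zeta sum truncated after `terms` terms; both names and the truncation level are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli(n_max):
    """Exact Bernoulli numbers B_0..B_n_max for the EGF z/(e^z - 1), so B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        s = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B


def rhs_39(n, terms=10000):
    """Right-hand side of (39), with the sum over k truncated after `terms` terms."""
    zeta_partial = sum(k ** (-2 * n) for k in range(1, terms + 1))
    return (-1) ** (n - 1) * 2.0 ** (1 - 2 * n) * pi ** (-2 * n) * zeta_partial


B = bernoulli(8)
for n in range(1, 5):
    exact = float(B[2 * n]) / factorial(2 * n)
    print(n, B[2 * n], exact, rhs_39(n))
```

For $n = 1$ the truncated sum converges slowly (the tail is of order $1/\texttt{terms}$); for larger $n$ the two columns agree to many digits after only a few terms.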
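Surjection numbers. In the same vein, the surjection numbers have EGF $R(z) =
(2 - e^z)^{-1}$ with simple poles at

$$\chi_k = \log 2 + 2ik\pi \quad\text{where}\quad R(z) \sim \frac{1}{2}\,\frac{1}{\chi_k - z}.$$

Since $R(z)$ stays bounded on circles passing half way in between poles, we find the
exact formula $R_n = \frac{1}{2}\, n! \sum_{k \in \mathbb{Z}} \chi_k^{-n-1}$. An equivalent real formulation is

(40)    $\displaystyle \frac{R_n}{n!} = \frac{1}{2}\left(\frac{1}{\log 2}\right)^{n+1} + \sum_{k=1}^{\infty} \frac{\cos((n+1)\theta_k)}{(\log^2 2 + 4k^2\pi^2)^{(n+1)/2}}, \qquad \theta_k := \arctan\!\left(\frac{2k\pi}{\log 2}\right),$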


which exhibits infinitely many harmonics of fast decaying amplitude. 
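
As a sanity check of (40), the sketch below computes the surjection numbers exactly from the recurrence $R_n = \sum_{j=1}^{n} \binom{n}{j} R_{n-j}$ (choose the first block, then order the rest; a standard fact we assume here) and compares $R_n/n!$ with the real formulation truncated after a fixed number of harmonics. The function names and the truncation level are ours; the comparison is made for $n \ge 2$, where the harmonic sum converges quickly.

```python
from math import atan, comb, cos, factorial, log, pi


def surjection_numbers(n_max):
    """Exact R_0..R_n_max from R_n = sum_{j=1..n} C(n, j) R_{n-j}, with R_0 = 1."""
    R = [1]
    for n in range(1, n_max + 1):
        R.append(sum(comb(n, j) * R[n - j] for j in range(1, n + 1)))
    return R


def formula_40(n, harmonics=200):
    """Right-hand side of (40), with the sum over k truncated after `harmonics` terms."""
    L = log(2.0)
    total = 0.5 * (1.0 / L) ** (n + 1)
    for k in range(1, harmonics + 1):
        theta = atan(2 * k * pi / L)
        total += cos((n + 1) * theta) / (L * L + 4 * k * k * pi * pi) ** ((n + 1) / 2)
    return total


R = surjection_numbers(10)
for n in (2, 5, 10):
    print(n, R[n] / factorial(n), formula_40(n))
```

Already the first term $\frac12 (\log 2)^{-n-1}$ gives several correct digits; the harmonics supply the small oscillating correction.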

▷ IV.33. Alternating permutations, tangent and secant numbers. The relation (39) also provides
a representation of the tangent numbers since $E_{2n-1} = (-1)^{n-1} B_{2n}\, 4^n (4^n - 1)/(2n)$. The
secant numbers $E_{2n}$ satisfy

$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^{2n+1}} = \frac{(\pi/2)^{2n+1}}{2\,(2n)!}\, E_{2n},$$

which can be read either as providing an asymptotic expansion of $E_{2n}$ or as an evaluation of the
sums on the left (the values of a Dirichlet $L$-function) in terms of $\pi$. The asymptotic number of
alternating permutations (Chapter II) is consequently known to great accuracy. ◁

▷ IV.34. Solutions to the equation $\tan(x) = x$. Let $x_n$ be the $n$th positive root of the equation
$\tan(x) = x$. For any integer $r \ge 1$, the sum $S(r) := \sum_n x_n^{-2r}$ is a computable rational
number. [From folklore and The American Mathematical Monthly.] ◁


IV.6.2. Localization of zeros and poles. We gather here a few results that often
prove useful in determining the location of zeros of analytic functions, and hence of 
poles of meromorphic functions. A detailed treatment of this topic may be found in 
Henrici’s book [265]. 

Let $f(z)$ be an analytic function in a region $\Omega$ and let $\gamma$ be a simple closed curve
interior to $\Omega$, and on which $f$ is assumed to have no zeros. We claim that the quantity

(41)    $\displaystyle N(f;\gamma) = \frac{1}{2i\pi} \int_{\gamma} \frac{f'(z)}{f(z)}\, dz$


exactly equals the number of zeros of $f$ inside $\gamma$ counted with multiplicity. [Proof: the
function $f'/f$ has its poles exactly at the zeros of $f$, and the residue at each pole $\alpha$
equals the multiplicity of $\alpha$ as a root of $f$; the assertion then results from the residue
theorem.]

Since a primitive function of $f'/f$ is $\log f$, the integral also represents the variation
of $\log f$ along $\gamma$, which is written $[\log f]_\gamma$. This variation itself reduces to $i$
times the variation of the argument of $f$ along $\gamma$, since $\log(re^{i\theta}) = \log r + i\theta$ and the
modulus $r$ has variation equal to 0 along a closed contour ($[\log r]_\gamma = 0$). The quantity
$[\theta]_\gamma$ is, by its definition, $2\pi$ multiplied by the number of times the transformed contour
$f(\gamma)$ winds about the origin. This observation is known as the Argument Principle:


Argument Principle. The number of zeros of $f(z)$ (counted with multiplicities)
inside $\gamma$ equals the winding number of the transformed contour $f(\gamma)$
around the origin.


By the same argument, if $f$ is meromorphic in $\Omega \supset \gamma$, then $N(f;\gamma)$ equals the difference
between the number of zeros and the number of poles of $f$ inside $\gamma$, multiplicities
being taken into account. Figure 11 exemplifies the use of the argument principle in
localizing zeros of a polynomial.
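
The experiment of Figure 11 is simple to reproduce. The sketch below follows the image of a circle under $P_4(z) = 1 - 2z + z^4$ and accumulates the small changes of argument between consecutive sample points; the function name `winding_number` and the sample count are assumptions of this illustration, and the method is only reliable when the sampling is fine enough that each step turns by less than $\pi$.

```python
import cmath
from math import pi


def P4(z):
    """The polynomial of Figure 11."""
    return 1 - 2 * z + z ** 4


def winding_number(f, radius, samples=4000):
    """Winding number about 0 of the image under f of the circle |z| = radius."""
    total = 0.0
    previous = f(complex(radius, 0.0))
    for step in range(1, samples + 1):
        current = f(radius * cmath.exp(2j * pi * step / samples))
        # Argument increment between consecutive image points, in (-pi, pi].
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2 * pi))


for radius in (0.4, 0.8, 1.2, 1.6):
    print(radius, winding_number(P4, radius))
```

By the Argument Principle, each printed integer is the number of zeros of $P_4$ inside the corresponding circle.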

By similar devices, we get Rouché’s theorem: 


Rouché’s theorem. Let the functions $f(z)$ and $g(z)$ be analytic in a region
containing in its interior the closed simple curve $\gamma$. Assume that $f$ and $g$
satisfy $|g(z)| < |f(z)|$ on the curve $\gamma$. Then $f(z)$ and $f(z) + g(z)$ have the
same number of zeros inside the interior domain delimited by $\gamma$.


An intuitive way to visualize Rouché’s Theorem is as follows: since $|g| < |f|$, then
$f(\gamma)$ and $(f + g)(\gamma)$ must have the same winding number.

▷ IV.35. Proof of Rouché’s theorem. Under the hypothesis of Rouché’s theorem, for $0 \le t \le 1$,
$h(z) = f(z) + t\,g(z)$ is such that $N(h; \gamma)$ is both an integer and an analytic, hence continuous,
function of $t$ in the given range. The conclusion of the theorem follows. ◁


▷ IV.36. The Fundamental Theorem of Algebra. Every complex polynomial $p(z)$ of degree $n$
has exactly $n$ roots. A proof follows by Rouché’s theorem from the fact that, for large enough
$|z| = R$, the polynomial assumed to be monic is a “perturbation” of its leading term, $z^n$. ◁

FIGURE IV.11. The transforms of $\gamma_j = \{|z| = \frac{4j}{10}\}$ by $P_4(z) = 1 - 2z + z^4$, for
$j = 1, 2, 3, 4$, demonstrate that $P_4(z)$ has no zero inside $|z| < 0.4$, one zero inside
$|z| < 0.8$, two zeros inside $|z| < 1.2$ and four zeros inside $|z| < 1.6$. The actual zeros are
at $\rho_4 = 0.54368$, $1$ and $-0.77184 \pm 1.11514\,i$.


▷ IV.37. Symmetric function of the zeros. Let $S_k(f;\gamma)$ be the sum of the $k$th powers of the
roots of equation $f(z) = 0$ inside $\gamma$. One has

$$S_k(f;\gamma) = \frac{1}{2i\pi} \int_{\gamma} \frac{f'(z)}{f(z)}\, z^k\, dz,$$

by a variant of the proof of the Argument Principle. ◁


These principles form the basis of numerical algorithms for locating zeros of ana- 
lytic functions, in particular the ones closest to the origin, which are of most interest to 
us. One can start from an initially large domain and recursively subdivide it until roots 
have been isolated with enough precision—the number of roots in a subdomain being 
at each stage determined by numerical integration; see Figure 11 and refer for instance 
to [117] for a discussion. Such algorithms even acquire the status of full proofs if one 
operates with guaranteed precision routines (using, e.g., careful implementations of 
interval arithmetics). 
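
A minimal sketch of the numerical-integration step, under the assumption that the contour is a circle centred at the origin and that plain floating-point arithmetic (not guaranteed precision) is acceptable: the integrals $N(f;\gamma)$ of (41) and $S_1(f;\gamma)$ of Note 37 are approximated by the trapezoidal rule, which isolates and then locates the zero of $P_4(z) = 1 - 2z + z^4$ inside $|z| < 0.8$. The name `power_sum` and the sample count are ours.

```python
import cmath
from math import pi


def P4(z):
    return 1 - 2 * z + z ** 4


def dP4(z):
    return -2 + 4 * z ** 3


def power_sum(f, df, radius, k=0, samples=2048):
    """Trapezoidal approximation of (1/(2 i pi)) * integral of f'(z)/f(z) * z^k dz
    over |z| = radius; k = 0 counts zeros, k = 1 sums them."""
    total = 0j
    for step in range(samples):
        z = radius * cmath.exp(2j * pi * step / samples)
        # On the circle, dz = i z d(theta).
        total += df(z) / f(z) * z ** k * (1j * z)
    return total * (2 * pi / samples) / (2j * pi)


count = power_sum(P4, dP4, 0.8, k=0)
root = power_sum(P4, dP4, 0.8, k=1)
print(count.real, root.real, abs(P4(root)))
```

Since exactly one zero lies inside the contour, $S_1$ is that zero itself; with several zeros one would compute $S_1, S_2, \ldots$ and recover the zeros from these power sums, or subdivide the domain as described above.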


IV.6.3. Patterns in words: a case-study. Analysing the coefficients of a single
generating function that is rational is a simple task, often even bordering on the trivial,
granted the exponential-polynomial formula for coefficients (Theorem IV.9). However,
in analytic combinatorics, we are often confronted with problems that involve
an infinite family of functions. In that case, Rouché’s Theorem and the Argument
Principle provide decisive tools for localizing poles, while Theorems IV.3 (Residue
Theorem) and IV.10 (Expansion of meromorphic functions) serve to determine effective
error terms. An illustration of this situation is the analysis of patterns in words for
which GFs have been derived in Chapters I (p. 50) and III (p. 200).

    Length (k)   Types                                    c(z)                  ρ
    k = 3        aab, abb, bba, baa                       1                     0.61803
                 aba, bab                                 1 + z^2               0.56984
                 aaa, bbb                                 1 + z + z^2           0.54368
    k = 4        aaab, aabb, abbb, bbba, bbaa, baaa       1                     0.54368
                 aaba, abba, abaa, bbab, baab, babb       1 + z^3               0.53568
                 abab, baba                               1 + z^2               0.53101
                 aaaa, bbbb                               1 + z + z^2 + z^3     0.51879

FIGURE IV.12. Patterns of length 3, 4: autocorrelation polynomial and dominant poles
of $S(z)$.


All patterns are not born equal. Surprisingly, in a random sequence of coin toss- 
ings, the pattern HTT is likely to occur much sooner (after 8 tosses on average) than the 
pattern HHH (needing 14 tosses on average); see the preliminary discussion in Example
I.12 (p. 56). Questions of this sort are of obvious interest in the statistical analysis
of genetic sequences [338, 491]. Say you discover that a sequence of length 100,000 
on the four letters A, G, C, T contains the pattern TACTAC twice. Can this be assigned 
to chance or is this likely to be a meaningful signal of some yet unknown structure? 
The difficulty here lies in quantifying precisely where the asymptotic regime starts, 
since, by Borges’s Theorem (Note I.32, p. 58), sufficiently long texts will almost certainly
contain any fixed pattern. The analysis of rational generating functions supplemented
by Rouché’s theorem provides definite answers to such questions, under
Bernoulli models at least.

We consider here the class $\mathcal{W}$ of words over an alphabet $\mathcal{A}$ of cardinality $m \ge 2$.
A pattern $p$ of some length $k$ is given. As seen in Chapters I and III, its autocorrelation
polynomial is central to enumeration. This polynomial is defined as $c(z) =
\sum_{j=0}^{k-1} c_j z^j$, where $c_j$ is 1 if $p$ coincides with its $j$th shifted version and 0 otherwise.
We consider here the enumeration of words containing the pattern $p$ at least once, and
dually of words excluding the pattern $p$. In other words, we look at problems such as:
What is the probability that a random text of length $n$ does (or does not) contain your
name as a block of consecutive letters?
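
The definition of the autocorrelation coefficients translates directly into code. In this small sketch (the function name `autocorrelation` is ours), "coincides with its $j$th shifted version" is read as: the suffix of $p$ starting at position $j$ equals the prefix of $p$ of the same length.

```python
def autocorrelation(pattern):
    """Coefficients [c_0, ..., c_{k-1}] of the autocorrelation polynomial c(z)."""
    k = len(pattern)
    # c_j = 1 when the pattern shifted by j positions matches itself on the overlap.
    return [1 if pattern[j:] == pattern[:k - j] else 0 for j in range(k)]


for p in ("aab", "aba", "aaa", "aaba", "abab", "aaaa"):
    print(p, autocorrelation(p))
```

For instance `autocorrelation("aba")` gives `[1, 0, 1]`, that is, $c(z) = 1 + z^2$.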

The OGF of the class of words excluding $p$ is, we recall,

(42)    $\displaystyle S(z) = \frac{c(z)}{z^k + (1 - mz)\,c(z)}$

(Proposition I.4, p. 57), and we shall start with the case $m = 2$ of a binary alphabet.
The function $S(z)$ is simply a rational function, but the location and nature of its poles
is yet unknown. We only know a priori that it should have a pole in the positive interval
somewhere between $\frac12$ and 1 (by Pringsheim’s Theorem and since its coefficients
are in the interval $[1, 2^n]$, for $n$ large enough). Figure 12 gives a small list, for patterns
of length $k = 3, 4$, of the pole $\rho$ of $S(z)$ that is nearest to the origin. Inspection of the
figure suggests $\rho$ to be close to $\frac12$ as soon as the pattern is long enough. We are going
to prove this fact, based on Rouché’s Theorem applied to the denominator of (42).

FIGURE IV.13. Complex zeros of $z^{31} + (1 - 2z)c(z)$ represented as joined by a polygonal
line: (left) correlated pattern $a(ba)^{15}$; (right) uncorrelated pattern $a(ab)^{15}$.

As regards termwise domination of coefficients, the autocorrelation polynomial
lies between 1 (for less correlated patterns like aaa...b) and $1 + z + \cdots + z^{k-1}$
(for the special case aaa...a). We set aside the special case of $p$ having only equal
letters, i.e., a “maximal” autocorrelation polynomial; this case is discussed at length
in the next chapter. Thus, in this scenario, the autocorrelation polynomial starts as
$1 + z^\ell + \cdots$ for some $\ell \ge 2$. Fix the number $A = 0.6$. On $|z| = A$, we have

(43)    $\displaystyle |c(z)| \ge \left|1 - (A^2 + A^3 + \cdots)\right| = \left|1 - \frac{A^2}{1 - A}\right| = \frac{1}{10}.$

In addition, the quantity $(1 - 2z)$ ranges over the circle of diameter $[-0.2, 2.2]$ as
$z$ varies along $|z| = A$, so that $|1 - 2z| \ge 0.2$. All in all, we have found that, for
$|z| = A$,

$$|(1 - 2z)\,c(z)| \ge 0.02.$$

On the other hand, for $k > 7$, we have $|z^k| < 0.017$ on the circle $|z| = A$. Then,
amongst the two terms composing the denominator of (42), the first is strictly dominated
by the second along $|z| = A$. By virtue of Rouché’s Theorem, the number
of roots of the denominator inside $|z| < A$ is then the same as the number of roots of
$(1 - 2z)c(z)$. The latter number is 1 (due to the root $\frac12$) since $c(z)$ cannot be 0 by the
argument of (43). Figure 13 exemplifies the extremely well-behaved characters of the
complex zeros.

In summary, we have found that for all patterns with at least two different letters
($\ell \ge 2$) and length $k \ge 8$, the denominator has a unique root in $|z| < A = 0.6$.
The property for lengths $k$ satisfying $4 \le k \le 7$ is then easily verified directly. The
case $\ell = 1$ where we are dealing with long runs of identical letters can be subjected
to an entirely similar argument (see also Example V.2, p. 285, for details). Therefore,
unicity of a simple pole $\rho$ of $S(z)$ in the interval $(0.5, 0.6)$ is granted.
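
The values of Figure 12 and the Rouché inequality can both be checked by machine for a binary alphabet. The sketch below is an illustration under stated assumptions: `dominant_pole` locates $\rho$ by bisection on the bracket $[0.5, 0.7]$ (our choice; the denominator of (42) is positive at $0.5$ and negative at $0.7$ for any binary pattern of length at least 3), and `rouche_margin` samples the circle $|z| = 0.6$, which gives numerical evidence rather than a proof. All function names are ours.

```python
import cmath
from math import pi


def autocorrelation(pattern):
    k = len(pattern)
    return [1 if pattern[j:] == pattern[:k - j] else 0 for j in range(k)]


def poly(coeffs, z):
    return sum(c * z ** j for j, c in enumerate(coeffs))


def denominator(pattern, z):
    """Denominator z^k + (1 - 2z) c(z) of (42) for a binary alphabet."""
    return z ** len(pattern) + (1 - 2 * z) * poly(autocorrelation(pattern), z)


def dominant_pole(pattern, lo=0.5, hi=0.7, iterations=60):
    """Bisection for the root of the denominator in [lo, hi] (sign change assumed)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if denominator(pattern, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def rouche_margin(pattern, A=0.6, samples=720):
    """Sampled min of |(1 - 2z) c(z)| on |z| = A, minus A^k; positive supports Rouché."""
    c = autocorrelation(pattern)
    points = (A * cmath.exp(2j * pi * step / samples) for step in range(samples))
    smallest = min(abs((1 - 2 * z) * poly(c, z)) for z in points)
    return smallest - A ** len(pattern)


for p in ("aab", "aba", "aaa", "aaab", "aaba", "abab", "aaaa"):
    print(p, round(dominant_pole(p), 5))
print(rouche_margin("abababab"), rouche_margin("aaaaaaab"))
```

The printed poles can be compared with the last column of Figure 12 (the figure appears to truncate rather than round the fifth decimal), and both margins are positive for these two patterns of length 8.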

It is then a simple matter to determine the local expansion of $S(z)$ near $z = \rho$,

$$S(z) \underset{z \to \rho}{\sim} \frac{\widetilde{\Lambda}}{\rho - z}, \qquad \widetilde{\Lambda} := \frac{c(\rho)}{2c(\rho) - (1 - 2\rho)\,c'(\rho) - k\rho^{k-1}},$$

from which a precise estimate for coefficients derives by Theorems IV.9 and IV.10.

The computation finally extends almost verbatim to nonbinary alphabets, with $\rho$
being now close to $1/m$. It suffices to use the disc of radius $A = 1.2/m$. The Rouché
part of the argument grants us unicity of the dominant pole in the interval $(1/m, A)$
for $k \ge 5$ when $m = 3$, and for $k \ge 4$ and any $m \ge 4$. (The remaining cases are
easily checked individually.)


Proposition IV.4. Consider an $m$-ary alphabet. Let $p$ be a fixed pattern of length $k \ge 4$,
with autocorrelation polynomial $c(z)$. Then the probability that a random word of
length $n$ does not contain $p$ as a pattern (a block of consecutive letters) satisfies

(44)    $\displaystyle \mathbb{P}_{\mathcal{W}_n}(p \text{ does not occur}) = \Lambda_p\,(m\rho)^{-n-1} + O\!\left(\left(\tfrac{5}{6}\right)^{n}\right),$

where $\rho \equiv \rho_p$ is the unique root in $\left(\frac{1}{m}, \frac{6}{5m}\right)$ of the equation $z^k + (1 - mz)c(z) = 0$
and $\Lambda_p := m\,c(\rho)/\big(m\,c(\rho) - c'(\rho)(1 - m\rho) - k\rho^{k-1}\big)$.
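
To see how sharp (44) is in practice, one can expand (42) exactly and compare. In the sketch below, `avoiding_counts` extracts the integer coefficients of $S(z)$ by the linear recurrence attached to its denominator, and `main_term` evaluates $\Lambda_p (m\rho)^{-n-1}$ with $\rho$ found by bisection on $(1/m, 1.2/m)$. The function names are ours, and the bisection assumes the sign change on that interval which the Rouché argument guarantees in the cases covered by the proposition.

```python
def autocorrelation(pattern):
    k = len(pattern)
    return [1 if pattern[j:] == pattern[:k - j] else 0 for j in range(k)]


def poly(coeffs, z):
    return sum(c * z ** j for j, c in enumerate(coeffs))


def avoiding_counts(pattern, m, n_max):
    """Coefficients s_0..s_n_max of S(z) = c(z) / (z^k + (1 - m z) c(z))."""
    k = len(pattern)
    c = autocorrelation(pattern)
    d = [0] * (k + 1)              # denominator D(z) = z^k + c(z) - m z c(z)
    d[k] += 1
    for j, cj in enumerate(c):
        d[j] += cj
        d[j + 1] -= m * cj
    s = []
    for n in range(n_max + 1):
        numerator = c[n] if n < k else 0
        # D(z) S(z) = c(z) with d[0] = c_0 = 1 gives s_n directly.
        s.append(numerator - sum(d[i] * s[n - i] for i in range(1, min(n, k) + 1)))
    return s


def dominant_root(pattern, m, iterations=100):
    k, c = len(pattern), autocorrelation(pattern)
    lo, hi = 1.0 / m, 1.2 / m      # D(1/m) = m^(-k) > 0; D(1.2/m) < 0 is assumed
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid ** k + (1 - m * mid) * poly(c, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def main_term(pattern, m, n):
    """Main term Lambda_p * (m rho)^(-n-1) of (44)."""
    k, c = len(pattern), autocorrelation(pattern)
    rho = dominant_root(pattern, m)
    dc = sum(j * cj * rho ** (j - 1) for j, cj in enumerate(c) if j > 0)
    lam = m * poly(c, rho) / (m * poly(c, rho) - dc * (1 - m * rho) - k * rho ** (k - 1))
    return lam * (m * rho) ** (-n - 1)


n = 40
exact = avoiding_counts("abab", 2, n)[n] / 2 ** n
print(exact, main_term("abab", 2, n))
print(main_term("TACTAC", 4, 100000))   # the genetic-sequence question raised earlier
```

For the pattern abab the two printed probabilities agree to many digits at $n = 40$. The last line evaluates only the main term of (44) for the pattern TACTAC on a four-letter alphabet, which is the quantity relevant to the question raised at the start of this subsection.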

Despite their austere appearance, these formulae have indeed a fairly concrete
content. First, the equation satisfied by $\rho$ can be put under the form $mz = 1 + z^k/c(z)$,
and, since $\rho$ is close to $\frac{1}{m}$, we may expect the approximation (remember the use of
“$\approx$” as meaning “numerically approximately equal”)

$$m\rho \approx 1 + \frac{1}{\gamma m^k},$$

where $\gamma := c(m^{-1})$ satisfies $1 \le \gamma < m/(m-1)$. By similar principles, the
probabilities in (44) should be approximately

$$\mathbb{P}_{\mathcal{W}_n}(p \text{ does not occur}) \approx \left(1 + \frac{1}{\gamma m^k}\right)^{-n} \approx e^{-n/(\gamma m^k)}.$$

For a binary alphabet, this tells us that the occurrence of a pattern of length $k$ starts
becoming likely when $n$ is of the order of $2^k$, that is, when $k$ is of the order of $\log_2 n$.
The more precise moment when this happens must depend (via $\gamma$) on the autocorrelation
of the pattern, with strongly correlated patterns having a tendency to occur a little
late. (This vastly generalizes our empirical observations of Chapter I.) However, the
mean number of occurrences of a pattern in a text of length $n$ does not depend on the
shape of the pattern. The apparent paradox is easily resolved: correlated patterns tend
to occur late, while being prone to appear in clusters. For instance, the “late” pattern
aaa, when it occurs, still has probability $\frac12$ to occur at the next position as well and
cash in another occurrence; in contrast no such possibility is available to the “early”
uncorrelated pattern aab, whose occurrences must be somewhat spread out.

Such analyses are important as they can be used to develop a precise understanding
of the behaviour of data compression algorithms (the Lempel–Ziv scheme); see
Julien Fayolle’s contribution [160] for details.

▷ IV.38. Multiple pattern occurrences. A similar analysis applies to the generating function
$S^{\langle s \rangle}(z)$ of words containing a fixed number $s$ of occurrences of a pattern $p$. The OGF is
obtained by expanding (with respect to $u$) the BGF $W(z, u)$ obtained in Chapter III by means
of an inclusion-exclusion argument. For $s \ge 1$, one finds

$$S^{\langle s \rangle}(z) = z^k\, \frac{N(z)^{s-1}}{D(z)^{s+1}}, \qquad D(z) = z^k + (1 - mz)\,c(z), \quad N(z) = z^k + (1 - mz)(c(z) - 1),$$

which now has a pole of multiplicity $s + 1$ at $z = \rho$. ◁


▷ IV.39. Patterns in Bernoulli sequences: asymptotics. Similar results hold when letters are
assigned nonuniform probabilities, $p_j = \mathbb{P}(a_j)$, for $a_j \in \mathcal{A}$. The weighted autocorrelation
polynomial is then defined by protrusions, as in Note II.38 (p. 202). Multiple pattern occurrences
can be also analysed. ◁


IV.7. Singularities and functional equations 


In the various combinatorial examples discussed so far in this chapter, we have 
been dealing with functions that are given by explicit expressions. Such situations 
essentially cover nonrecursive structures as well as the very simplest recursive ones, 
like Catalan or Motzkin trees, whose generating functions are expressible in terms of 
radicals. In fact, as we shall see extensively in this book, complex analytic methods
are instrumental in analysing coefficients of functions implicitly specified by func- 
tional equations. In other words: the nature of a functional equation can often provide 
information regarding the singularities of its solution. Chapter V will illustrate this 
philosophy in the case of rational functions defined by systems of positive equations; 
a very large number of examples will then be given in Chapters VI and VII, where 
singularities much more general than poles are treated. 

In this section, we discuss three representative functional equations,

$$f(z) = z\,e^{f(z)}, \qquad f(z) = z + f(z^2 + z^3), \qquad f(z) = \frac{1}{1 - z f(z^2)},$$

that illustrate the use of fundamental inversion or iteration properties to locate dominant
singularities and derive exponential growth estimates for coefficients.


IV.7.1. Inverse functions. We start with a generic problem: given a function $\psi$
analytic at a point $y_0$ with $z_0 = \psi(y_0)$, what can be said about its inverse, namely the
solution(s) to the equation $\psi(y) = z$ when $z$ is near $z_0$ and $y$ near $y_0$?

Let us examine what happens when $\psi'(y_0) \ne 0$, first without paying attention to
analytic rigour. One has locally (‘$\approx$’ means as usual ‘approximately equal’)

(45)    $\displaystyle \psi(y) \approx \psi(y_0) + \psi'(y_0)(y - y_0),$

so that the equation $\psi(y) = z$ should admit, for $z$ near $z_0$, a solution satisfying

(46)    $\displaystyle y \approx y_0 + \frac{1}{\psi'(y_0)}\,(z - z_0).$

If this is granted, the solution being locally linear, it is differentiable, hence analytic.
The Analytic Inversion Lemma¹² provides a firm foundation for this calculation.


Lemma IV.2 (Analytic Inversion). Let $\psi(z)$ be analytic at $y_0$, with $\psi(y_0) = z_0$.
Assume that $\psi'(y_0) \ne 0$. Then, for $z$ in some small neighbourhood $\Omega_0$ of $z_0$, there
exists an analytic function $y(z)$ that solves the equation $\psi(y) = z$ and is such that
$y(z_0) = y_0$.

PROOF. (Sketch) The proof involves ideas analogous to those used to establish Rouché’s
Theorem and the Argument Principle (see especially the argument justifying Equation
(41), p. 256). As a preliminary step, define the integrals ($j \in \mathbb{Z}_{\ge 0}$)

(47)    $\displaystyle \sigma_j(z) := \frac{1}{2i\pi} \int_{\gamma} \frac{\psi'(y)}{\psi(y) - z}\, y^j\, dy,$

where $\gamma$ is a small enough circle centred at $y_0$ in the $y$-plane.

First consider $\sigma_0$. This function satisfies $\sigma_0(z_0) = 1$ [by the Residue Theorem]
and is a continuous function of $z$ whose value can only be an integer, this value being
the number of roots of the equation $\psi(y) = z$. Thus, for $z$ close enough to $z_0$, one
must have $\sigma_0(z) = 1$. In other words, the equation $\psi(y) = z$ has exactly one solution,
the function $\psi$ is locally invertible and a solution $y = y(z)$ that satisfies $y(z_0) = y_0$ is
well-defined.

Next examine $\sigma_1$. By the Residue Theorem once more, the integral defining $\sigma_1(z)$
is the sum of the roots of the equation $\psi(y) = z$ that lie inside $\gamma$, that is, in our case,
the value of $y(z)$ itself. (This is also a particular case of Note 37.) Thus, one has
$\sigma_1(z) = y(z)$. Since the integral defining $\sigma_1(z)$ depends analytically on $z$ for $z$ close
enough to $z_0$, analyticity of $y(z)$ results.
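
The integrals (47) can be evaluated numerically, which turns the proof into a small experiment. The sketch below takes $\psi(y) = y e^{-y}$, $y_0 = z_0 = 0$ and the circle $|y| = 0.5$ (all our choices), approximates $\sigma_0$ and $\sigma_1$ by the trapezoidal rule, and compares $\sigma_1(z)$ with the Cayley tree series $\sum_{n \ge 1} n^{n-1} z^n/n!$, which is the known power series inverse of this particular $\psi$. The function names `sigma` and `tree_series` are assumptions of this illustration.

```python
import cmath
from math import factorial, pi


def psi(y):
    return y * cmath.exp(-y)


def dpsi(y):
    return (1 - y) * cmath.exp(-y)


def sigma(power, z, radius=0.5, samples=2048):
    """Trapezoidal approximation of sigma_power(z) in (47) on the circle |y| = radius."""
    total = 0j
    for step in range(samples):
        y = radius * cmath.exp(2j * pi * step / samples)
        # On the circle, dy = i y d(theta).
        total += dpsi(y) / (psi(y) - z) * y ** power * (1j * y)
    return total * (2 * pi / samples) / (2j * pi)


def tree_series(z, terms=80):
    """Truncated series sum_{n>=1} n^(n-1) z^n / n! (the Cayley tree function)."""
    return sum(n ** (n - 1) * z ** n / factorial(n) for n in range(1, terms + 1))


z = 0.2
print(sigma(0, z))                   # number of roots of psi(y) = z inside the circle
print(sigma(1, z), tree_series(z))   # the root y(z), computed in two ways
```

At $z = 0.2$ one finds $\sigma_0 = 1$ and the two evaluations of $y(z)$ agree; the value of $z$ must stay below $\min_{|y| = 0.5} |\psi(y)| \approx 0.30$ for the contour to enclose exactly one root.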


▷ IV.40. Details. Let $\psi$ be analytic in an open disc $D$ centred at $y_0$. Then, there exists a small
circle $\gamma$ centred at $y_0$ and contained in $D$ such that $\psi(y) \ne z_0$ on $\gamma$. [Zeros of analytic functions
are isolated, a fact that results from the definition of an analytic expansion.] The integrals $\sigma_j(z)$
are thus well defined for $z$ restricted to be close enough to $z_0$, which ensures that there exists
a $\delta > 0$ such that $|\psi(y) - z| > \delta$ for all $y \in \gamma$. One can then expand the integrand as a
power series in $(z - z_0)$, integrate the expansion termwise, and form in this way the analytic
expansions of $\sigma_0$, $\sigma_1$ at $z_0$. [This line of proof follows [269, I, §9.4].] ◁


▷ IV.41. Inversion and majorant series. The process corresponding to (45) and (46) can be
transformed into a sound proof: first derive a formal power series solution, then verify that the
formal solution is locally convergent using the method of majorant series (p. 236). ◁

The Analytic Inversion Lemma states the following: An analytic function locally
admits an analytic inverse near any point where its first derivative is nonzero. However,
as we see next, a function cannot be analytically inverted in a neighbourhood of
a point where its first derivative vanishes.

¹² A more general statement and several proof techniques are also discussed in APPENDIX B: Implicit
Function Theorem, p. 698.

Consider now a function $\psi(y)$ such that $\psi'(y_0) = 0$ but $\psi''(y_0) \ne 0$; then, by the
Taylor expansion of $\psi$, one expects

(48)    $\displaystyle \psi(y) \approx \psi(y_0) + \frac{1}{2}(y - y_0)^2\, \psi''(y_0).$

Solving formally for $y$ now indicates a locally quadratic dependency

$$(y - y_0)^2 \approx \frac{2}{\psi''(y_0)}\,(z - z_0),$$

and the inversion problem admits two solutions satisfying

(49)    $\displaystyle y \approx y_0 \pm \sqrt{\frac{2}{\psi''(y_0)}}\; \sqrt{z - z_0}.$


What this informal argument suggests is that the solutions have a singularity at $z_0$, and,
in order for them to be suitably specified, one must somehow restrict their domain of
definition: the case of $\sqrt{z}$ (the root(s) of $y^2 - z = 0$) discussed on p. 217 is typical.
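
A quick numerical look at (49), as a sketch under our own choice of example: for $\psi(y) = y e^{-y}$ one has $y_0 = 1$, $z_0 = e^{-1}$ and $\psi''(y_0) = -e^{-1}$, so that (49) reads $y \approx 1 \pm \sqrt{2(1 - ez)}$. The function `small_branch` (our name) solves $\psi(y) = z$ by bisection on $[0, 1]$, where $\psi$ is increasing, and `approx_49` evaluates the branch of (49) with the minus sign.

```python
from math import e, exp, sqrt


def psi(y):
    return y * exp(-y)


def small_branch(z, iterations=200):
    """Solution of psi(y) = z with 0 <= y <= 1, by bisection (psi increases on [0, 1])."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if psi(mid) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def approx_49(z):
    """Branch of (49) below y0 = 1 for psi(y) = y exp(-y): 1 - sqrt(2 (1 - e z))."""
    return 1 - sqrt(2 * (1 - e * z))


for gap in (1e-2, 1e-4, 1e-6):
    z = 1 / e - gap
    print(gap, small_branch(z), approx_49(z))
```

As $z$ approaches $z_0 = 1/e$ from below, the exact root and the square-root approximation get closer, with a discrepancy of smaller order than $\sqrt{z_0 - z}$.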

Given some point $z_0$ and a neighbourhood $\Omega$, the slit neighbourhood along direction
$\theta$ is the set

$$\Omega^{\setminus\theta} := \{\, z \in \Omega \;|\; \arg(z - z_0) \not\equiv \theta \bmod 2\pi \,\}.$$

We state:


Lemma IV.3 (Singular Inversion). Let $\psi(y)$ be analytic at $y_0$, with $\psi(y_0) = z_0$.
Assume that $\psi'(y_0) = 0$ and $\psi''(y_0) \ne 0$. There exists a small neighbourhood $\Omega_0$
such that the following holds: for any direction $\theta$, there exist two functions, $y_1(z)$
and $y_2(z)$ defined on $\Omega_0^{\setminus\theta}$ that satisfy $\psi(y(z)) = z$; each is analytic in $\Omega_0^{\setminus\theta}$, has a
singularity at the point $z_0$, and satisfies $\lim_{z \to z_0} y(z) = y_0$.

PROOF. (Sketch) Define the functions $\sigma_j(z)$ as in the proof of the previous lemma,
Equation (47). One now has $\sigma_0(z) = 2$, that is, the equation $\psi(y) = z$ possesses two
roots near $y_0$, when $z$ is near $z_0$. In other words $\psi$ effects a double covering of a small
neighbourhood $\Omega$ of $y_0$ onto the image neighbourhood $\Omega_0 = \psi(\Omega) \ni z_0$. By possibly
restricting $\Omega$, we may furthermore assume that $\psi'(y)$ only vanishes at $y_0$ in $\Omega$ (zeros
of analytic functions are isolated) and that $\Omega$ is simply connected.

Fix any direction $\theta$ and consider the slit neighbourhood $\Omega_0^{\setminus\theta}$. Fix a point $\zeta$ in
this slit domain; it has two preimages, $\eta_1, \eta_2 \in \Omega$. Pick up the one named $\eta_1$. Since
$\psi'(\eta_1)$ is nonzero, the Analytic Inversion lemma applies: there is a local analytic
inverse $y_1(z)$ of $\psi$. This $y_1(z)$ can then be uniquely continued¹³ to the whole of $\Omega_0^{\setminus\theta}$,
and similarly for $y_2(z)$. We have thus obtained two distinct analytic inverses.

Assume a contrario that $y_1(z)$ can be analytically continued at $z_0$. It would then
admit a local expansion

$$y_1(z) = \sum_{n \ge 0} c_n (z - z_0)^n,$$

while satisfying $\psi(y_1(z)) = z$. But then, composing the expansions of $\psi$ and $y$ would
entail

$$\psi(y_1(z)) = z_0 + O\big((z - z_0)^2\big) \qquad (z \to z_0),$$

which cannot coincide with the identity function ($z$). A contradiction has been reached.
The point $z_0$ is thus a singular point for $y_1$ (as well as for $y_2$).

¹³ The fact of slitting $\Omega_0$ makes the resulting domain simply connected, so that analytic continuation
becomes uniquely defined. In contrast, the punctured domain $\Omega_0 \setminus \{z_0\}$ is not simply connected, so that
the argument cannot be applied to it. As a matter of fact, $y_1(z)$ gets continued to $y_2(z)$, when the ray of
angle $\theta$ is crossed: the point $z_0$ where two determinations meet is a branch point.


▷ IV.42. Singular inversion and majorant series. In a way that parallels Note 41, the process
summarized by Equations (48) and (49) can be justified by the method of majorant series, which
leads to an alternative proof of the Singular Inversion Lemma. ◁


▷ IV.43. Higher order branch points. If all derivatives of $\psi$ till order $r - 1$ inclusive vanish
at $y_0$, there are $r$ inverses, $y_1(z), \ldots, y_r(z)$, defined over a slit neighbourhood of $z_0$. ◁


Tree enumeration. We can now consider the problem of obtaining information
on the coefficients of a function $y(z)$ defined by an implicit equation

(50)    $\displaystyle y(z) = z\,\phi(y(z)),$

when $\phi(u)$ is analytic at $u = 0$. In order for the problem to be well-posed (algebraically,
in terms of formal power series, as well as analytically, near the origin), we
assume that $\phi(0) \ne 0$. Equation (50) may then be rephrased as

(51)    $\displaystyle \psi(y(z)) = z \quad\text{where}\quad \psi(u) = \frac{u}{\phi(u)},$

so that it is in fact an instance of the inversion problem for analytic functions.

Equation (50) occurs in the counting of various types of trees, as seen in Subsections
I.5.1 (p. 61), II.5.1 (p. 116), and III.6.2 (p. 182). A typical case is $\phi(u) = e^u$,
which corresponds to labelled nonplane trees, known as Cayley trees. The function
$\phi(u) = (1 + u)^2$ is associated to unlabelled plane binary trees and $\phi(u) = 1 + u + u^2$
to unary–binary trees (Motzkin trees). A full analysis was developed by Meir and
Moon [356], themselves elaborating on earlier ideas of Pólya [395, 397] and Otter
[382]. In all these cases, the exponential growth rate of the number of trees can be
automatically determined.


Proposition IV.5. Let $\phi$ be a function analytic at 0, having nonnegative Taylor coefficients,
and such that $\phi(0) \ne 0$. Let $R \le +\infty$ be the radius of convergence of the
series representing $\phi$ at 0. Under the condition,

(52)    $\displaystyle \lim_{x \to R^-} \frac{x\,\phi'(x)}{\phi(x)} > 1,$

there exists a unique solution $\tau \in (0, R)$ of the characteristic equation,

(53)    $\displaystyle \frac{\tau\,\phi'(\tau)}{\phi(\tau)} = 1.$

Then, the formal solution $y(z)$ of the equation $y(z) = z\phi(y(z))$ is analytic at 0 and
its coefficients satisfy the exponential growth formula:

$$[z^n]\, y(z) \bowtie \left(\frac{1}{\rho}\right)^{n} \quad\text{where}\quad \rho = \frac{\tau}{\phi(\tau)} = \frac{1}{\phi'(\tau)}.$$

Note that condition (52) is automatically realized as soon as $\phi(R^-) = +\infty$, which
covers our earlier examples as well as all the cases where $\phi$ is an entire function (e.g.,
a polynomial). Figure 14 displays graphs of functions on the real line associated to a
typical inversion problem, that of Cayley trees, where $\phi(u) = e^u$.

FIGURE IV.14. Singularities of inverse functions: $\phi(u) = e^u$ (left); $\psi(u) = u/\phi(u)$
(middle); $y = \operatorname{Inv}(\psi)$ (right).


PROOF. By Note 44 below, the function $x\phi'(x)/\phi(x)$ is an increasing function of $x$
for $x \in (0, R)$. Condition (52) thus guarantees the existence and unicity of a solution
of the characteristic equation. (Alternatively, rewrite the characteristic equation as
$\phi_0 = \phi_2\tau^2 + 2\phi_3\tau^3 + \cdots$, where the right side is clearly an increasing function.)

Next, we observe that the equation $y = z\phi(y)$ admits a unique formal power
series solution, which furthermore has nonnegative coefficients. (This solution can for
instance be built by the method of indeterminate coefficients.) The Analytic Inversion
Lemma (Lemma IV.2) then implies that this formal solution represents a function,
$y(z)$, that is analytic at 0, where it satisfies $y(0) = 0$.

Now comes the hunt for singularities and, by Pringsheim’s Theorem, one may
restrict attention to the positive real axis. Let $r \le +\infty$ be the radius of convergence
of $y(z)$ at 0 and set $y(r) := \lim_{x \to r^-} y(x)$, which is well defined (though possibly
infinite), given positivity of coefficients. Our goal is to prove that $y(r) = \tau$.

- Assume a contrario that $y(r) < \tau$. One would then have $\psi'(y(r)) \ne 0$. By
  the Analytic Inversion Lemma, $y(z)$ would be analytic at $r$, a contradiction.

- Assume a contrario that $y(r) > \tau$. There would then exist $r^* \in (0, r)$ such
  that $\psi'(y(r^*)) = 0$. But then $y$ would be singular at $r^*$, by the Singular
  Inversion Lemma, also a contradiction.

Thus, one has $y(r) = \tau$, which is finite. Finally, since $y$ and $\psi$ are inverse functions,
one must have

$$r = \psi(\tau) = \frac{\tau}{\phi(\tau)} = \rho,$$

by continuity as $x \to r^-$, which completes the proof.

Proposition IV.5 thus yields an algorithm that produces the exponential growth
rate associated to tree functions. This rate is itself invariably a computable number as
soon as $\phi$ is computable (i.e., its sequence of coefficients is computable). This computability
result complements Theorem IV.8 which is relative to nonrecursive structures
only.

As an example of application of Proposition IV.5, general Catalan trees correspond
to $\phi(y) = (1 - y)^{-1}$, whose radius of convergence is R = 1. The characteristic
equation is $\tau/(1 - \tau) = 1$, which implies $\tau = 1/2$ and $\rho = 1/4$. We obtain (not a
surprise!) $y_n \bowtie 4^n$, a weak asymptotic formula for the Catalan numbers. Similarly,
for Cayley trees, $\phi(u) = e^u$ and $R = +\infty$. The characteristic equation reduces to
$(\tau - 1)e^{\tau} = 0$, so that $\tau = 1$ and $\rho = e^{-1}$, giving a weak form of Stirling’s formula:
$[z^n]y(z) = n^{n-1}/n! \bowtie e^n$. Figure 15 summarizes the application of the method to a few
already encountered tree families.

FIGURE IV.15. Exponential growth for classical tree families (binary trees, Motzkin trees,
general Catalan trees, Cayley trees).
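
The algorithm implicit in Proposition IV.5 can be sketched numerically. The following Python fragment is an illustration only, not part of the original text: it locates $\tau$ by bisection on $g(t) = \phi(t) - t\phi'(t)$ and returns $\rho = \tau/\phi(\tau)$. The function names (`solve_tau`, `growth`) and the table `FAMILIES` are hypothetical, and the choices $\phi(u) = (1+u)^2$ for binary trees and $\phi(u) = 1+u+u^2$ for Motzkin trees are assumed conventions, since the corresponding columns of Figure 15 are not available here.

```python
import math

def solve_tau(phi, dphi, upper, tol=1e-13):
    """Bisection for the root of g(t) = phi(t) - t*phi'(t) on (0, upper).

    g(0) = phi(0) > 0 and g is nonincreasing (g'(t) = -t*phi''(t) <= 0),
    so a sign change on (0, upper) brackets the unique positive root.
    """
    g = lambda t: phi(t) - t * dphi(t)
    lo, hi = 0.0, upper
    if g(hi) > 0:
        raise ValueError("no sign change: choose a larger `upper` (below R)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def growth(phi, dphi, upper):
    """Return (tau, rho, 1/rho), with rho = tau/phi(tau)."""
    tau = solve_tau(phi, dphi, upper)
    rho = tau / phi(tau)
    return tau, rho, 1 / rho

# name -> (phi, phi', search bound below the radius of convergence R)
FAMILIES = {
    "binary":  (lambda u: (1 + u) ** 2, lambda u: 2 * (1 + u), 5.0),   # assumed phi
    "motzkin": (lambda u: 1 + u + u * u, lambda u: 1 + 2 * u, 5.0),    # assumed phi
    "catalan": (lambda u: 1 / (1 - u), lambda u: 1 / (1 - u) ** 2, 0.999),
    "cayley":  (math.exp, math.exp, 5.0),
}

for name, (phi, dphi, upper) in FAMILIES.items():
    tau, rho, rate = growth(phi, dphi, upper)
    print(f"{name:8s} tau = {tau:.6f}  rho = {rho:.6f}  growth rate = {rate:.6f}")
```

Under these assumptions the computed growth rates are 4, 3, 4 and e respectively, in agreement with the two cases worked out in the text.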

As our previous discussion suggests, the dominant singularity of tree generating 
functions is, under mild conditions, of the square-root type. Such a singular behaviour 
can then be analysed by the methods of Chapter VI: the coefficients admit an asymptotic
form

$[z^n]y(z) \sim C \cdot \rho^{-n} n^{-3/2}$,

with a subexponential factor of the form $n^{-3/2}$; see Section VI.7, p. 385.


▷ IV.44. Convexity of GFs, Boltzmann models, and the Variance Lemma. Let $\phi(z)$ be a
non-constant analytic function with nonnegative coefficients and a nonzero radius of
convergence R, such that $\phi(0) \ne 0$. For $x \in (0, R)$ a parameter, define the Boltzmann
random variable $\Xi$ (of parameter x) by the property

(54)    $\mathbb{P}(\Xi = n) = \dfrac{\phi_n x^n}{\phi(x)}$,   with   $\mathbb{E}(s^{\Xi}) = \dfrac{\phi(sx)}{\phi(x)}$

the probability generating function of $\Xi$. By differentiation, the first two moments of $\Xi$ are

$\mathbb{E}(\Xi) = \dfrac{x\phi'(x)}{\phi(x)}$,   $\mathbb{E}(\Xi^2) = \dfrac{x^2\phi''(x)}{\phi(x)} + \dfrac{x\phi'(x)}{\phi(x)}$.

There results, for any nonconstant GF $\phi$, the general convexity inequality valid for $0 < x < R$:

(55)    $\dfrac{d}{dx}\left(\dfrac{x\phi'(x)}{\phi(x)}\right) > 0$,

due to the fact that the variance of a nondegenerate random variable is always positive.
Equivalently, the function $\log(\phi(e^t))$ is convex for $t \in (-\infty, \log R)$. (In statistical physics,
a Boltzmann model (of parameter x) corresponds to a class $\Phi$ (with OGF $\phi$) from which elements
are drawn according to the size distribution (54). An alternative derivation of (55) is given in
Note VIII.4, p. 516.) ◁
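
The link between (55) and the variance can be checked numerically: the variance of the Boltzmann variable equals x times the derivative appearing in (55). The short Python sketch below is an illustration, not part of the original note. It uses a GF given by a finite coefficient list; the choice $\phi(z) = 1 + z + z^2$, the parameter value x = 0.7, and the helper names are arbitrary assumptions.

```python
def boltzmann_moments(coeffs, x):
    """Mean and variance of the Boltzmann variable for phi(z) = sum coeffs[n] z^n."""
    weights = [c * x ** n for n, c in enumerate(coeffs)]
    total = sum(weights)                          # phi(x)
    probs = [w / total for w in weights]          # P(Xi = n), as in (54)
    mean = sum(n * p for n, p in enumerate(probs))
    second = sum(n * n * p for n, p in enumerate(probs))
    return mean, second - mean ** 2

def log_derivative(coeffs, x):
    """The quantity x * phi'(x) / phi(x)."""
    phi = sum(c * x ** n for n, c in enumerate(coeffs))
    dphi = sum(n * c * x ** (n - 1) for n, c in enumerate(coeffs) if n > 0)
    return x * dphi / phi

coeffs = [1, 1, 1]        # illustrative choice: phi(z) = 1 + z + z^2
x, h = 0.7, 1e-6
mean, var = boltzmann_moments(coeffs, x)

# Central finite difference for d/dx (x phi'(x)/phi(x)), the left side of (55).
slope = (log_derivative(coeffs, x + h) - log_derivative(coeffs, x - h)) / (2 * h)

print("mean         :", mean, "vs", log_derivative(coeffs, x))
print("variance     :", var)
print("x * d/dx(...):", x * slope)
```

The mean agrees with $x\phi'(x)/\phi(x)$, and the variance agrees with x times the finite-difference slope up to discretization error, which is the content of the Variance Lemma for this example.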


▷ IV.45. A variant form of the inversion problem. Consider the equation $y = z + \phi(y)$, where
$\phi$ is assumed to have nonnegative coefficients and be entire, with $\phi(u) = O(u^2)$ at u = 0. This
corresponds to a simple variety of trees in which trees are counted by the number of their leaves
only. For instance, we have already encountered labelled hierarchies (phylogenetic trees in
Section II.5, p. 119) corresponding to $\phi(u) = e^u - 1 - u$, which gives rise to one of “Schröder’s
problems”. Let $\tau$ be the root of $\phi'(\tau) = 1$ and set $\rho = \tau - \phi(\tau)$. Then $[z^n]y(z) \bowtie \rho^{-n}$. For
the EGF L of labelled hierarchies ($L = z + e^L - 1 - L$), this gives $L_n/n! \bowtie (2\log 2 - 1)^{-n}$.
(Observe that Lagrange inversion also provides $[z^n]y(z) = \frac{1}{n}[w^{n-1}](1 - w^{-1}\phi(w))^{-n}$.) ◁


IV.7.2. Iteration. The study of iteration of analytic functions was launched by
Fatou and Julia in the first half of the twentieth century. Our reader is certainly aware 
of the beautiful images associated with the name of Mandelbrot whose works have 
triggered renewed interest in these questions now classified as resorting to the field 
of “complex dynamics” [24, 122, 362, 387]. In particular, the sets that appear in this 
context are often of a fractal nature. Mathematical objects of this sort are occasionally 
encountered in analytic combinatorics. We present here the first steps of a classic 
analysis of balanced trees published by Odlyzko [375] in 1982. 


Consider the class $\mathcal{E}$ of balanced 2-3 trees defined as trees whose node degrees
are restricted to the set {0, 2, 3}, with the additional property that all leaves are at the
same distance from the root (Note 57, p. 83). We adopt as notion of size the number
of leaves (also called external nodes), the list of all 4 trees of size 8 being:

[Figure: the four balanced 2-3 trees of size 8.]

Given an existing tree, a new tree is obtained by substituting in all possible ways to
each external node ($\square$) either a pair ($\square, \square$) or a triple ($\square, \square, \square$), and symbolically,
one has

$\mathcal{E}[\square] = \square + \mathcal{E}\bigl[\square \to (\square\square + \square\square\square)\bigr]$.

In accordance with the specification, the OGF of $\mathcal{E}$ satisfies the functional equation

(56)    $E(z) = z + E(z^2 + z^3)$,

corresponding to the seemingly innocuous recurrence

$E_n = \sum_{k=0}^{n} \binom{k}{n-2k} E_k$   with   $E_0 = 0,\ E_1 = 1$.
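
The recurrence is easy to run. The Python sketch below is an illustration, not part of the original text (the function name `balanced_23_counts` is hypothetical). It computes the numbers $E_n$ from $E_n = \sum_k \binom{k}{n-2k} E_k$ with $E_0 = 0$, $E_1 = 1$, and prints the nth root of $E_n$ for a few values of n, as an informal preview of the exponential growth rate discussed next.

```python
from math import comb

def balanced_23_counts(n_max):
    """Return [E_0, ..., E_n_max] for balanced 2-3 trees counted by leaves."""
    E = [0, 1]
    for n in range(2, n_max + 1):
        # binom(k, n-2k) is nonzero only when 0 <= n-2k <= k,
        # that is, for ceil(n/3) <= k <= floor(n/2).
        E.append(sum(comb(k, n - 2 * k) * E[k]
                     for k in range((n + 2) // 3, n // 2 + 1)))
    return E[:n_max + 1]

E = balanced_23_counts(400)
print(E[1:11])                      # [1, 1, 1, 1, 2, 2, 3, 4, 5, 8]

golden = (1 + 5 ** 0.5) / 2
for n in (50, 100, 200, 400):
    # E_n^(1/n) creeps slowly towards the golden ratio (polynomial factors intervene)
    print(n, E[n] ** (1 / n), golden)
```

The value E[8] = 4 matches the four trees of size 8 mentioned above.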


Let $\sigma(z) = z^2 + z^3$. Equation (56) can be expanded by iteration in the ring of
formal power series,

(57)    $E(z) = z + \sigma(z) + \sigma^{[2]}(z) + \sigma^{[3]}(z) + \cdots$,

where $\sigma^{[j]}(z)$ denotes the jth iterate of the polynomial $\sigma$: $\sigma^{[0]}(z) = z$,
$\sigma^{[h+1]}(z) = \sigma^{[h]}(\sigma(z)) = \sigma(\sigma^{[h]}(z))$. Thus, E(z) is nothing but the sum of all iterates of $\sigma$.
The problem is to determine the radius of convergence of E(z), and by Pringsheim’s
theorem, the quest for dominant singularities can be limited to the positive real line.
For z > 0, the polynomial $\sigma(z)$ has a unique fixed point, $\rho = \sigma(\rho)$, at

$\rho = \dfrac{1}{\varphi}$   where   $\varphi = \dfrac{1 + \sqrt{5}}{2}$

is the golden ratio. Also, for any positive x satisfying $x < \rho$, the iterates $\sigma^{[j]}(x)$
do converge to 0; see Figure 16. Furthermore, since $\sigma(z) \sim z^2$ near 0, these iterates
converge to 0 doubly exponentially fast (Note 46). By the triangle inequality, $|\sigma(z)| \le \sigma(|z|)$,
the sum in (57) is a normally converging sum of analytic functions, and is thus
itself analytic. Consequently E(z) is analytic in the whole of the open disk $|z| < \rho$.

x_0 = 0.6
x_1 = 0.576
x_2 = 0.522878976
x_3 = 0.416358802
x_4 = 0.245532388
x_5 = 0.075088357
x_6 = 0.006061629
x_7 = 0.000036966
x_8 = 0.000000001
x_9 = 1.867434390 × 10^{-18}
x_10 = 3.487311201 × 10^{-36}

FIGURE IV.16. The iterates of a point $x_0 \in (0, 1/\varphi)$, here $x_0 = 0.6$, by $\sigma(z) = z^2 + z^3$
converge fast to 0.
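
The table of Figure 16 can be reproduced in a few lines. The following Python sketch is illustrative only (the names `sigma` and `iterates` are hypothetical): it iterates $\sigma(z) = z^2 + z^3$ from $x_0 = 0.6$, checks that $1/\varphi$ is a fixed point, and forms the partial sum of the iterates, that is, a truncation of the expansion (57) at x = 0.6.

```python
def sigma(z):
    """The polynomial sigma(z) = z^2 + z^3 of the 2-3 tree equation."""
    return z ** 2 + z ** 3

def iterates(x0, count):
    """Return [x0, sigma(x0), sigma(sigma(x0)), ...] with `count` applications."""
    xs = [x0]
    for _ in range(count):
        xs.append(sigma(xs[-1]))
    return xs

rho = 2 / (1 + 5 ** 0.5)            # 1/phi, the positive fixed point of sigma
print("fixed point check:", sigma(rho) - rho)

xs = iterates(0.6, 10)
for j, x in enumerate(xs):
    print(f"x_{j} = {x:.9e}")

# Truncation of (57) at x = 0.6; later iterates are negligible in double precision.
partial_sum = sum(xs)
print("E(0.6) is approximately", partial_sum)
```

Starting instead from a point slightly above $1/\varphi$ makes the iterates increase without bound, which is the numerical counterpart of the role of $\rho$ as a threshold.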


It remains to prove that the radius of convergence of E(z) is exactly equal to $\rho$.
To that purpose it suffices to observe that E(z), as given by (57), satisfies

$E(x) \to +\infty$   as   $x \to \rho^-$.

Let N be an arbitrarily large but fixed integer. It is possible to select a positive $x_N$
sufficiently close to $\rho$ with $x_N < \rho$, such that the Nth iterate $\sigma^{[N]}(x_N)$ is larger than
1/2 (the function $\sigma^{[N]}(x)$ admits $\rho$ as a fixed point and it is continuous and increasing at
$\rho$). Since $\sigma(x) < x$ on $(0, \rho)$, the earlier iterates of $x_N$ are larger still. Given the sum
expression (57), this entails the lower bound $E(x_N) > N/2$ for such
an $x_N < \rho$. Thus E(x) is unbounded as $x \to \rho^-$ and $\rho$ is a singularity.

The dominant positive real singularity of E(z) is thus $\rho = \varphi^{-1}$, and the Exponential
Growth Formula gives:


Proposition IV.6. The number of balanced 2-3 trees satisfies:

(58)    $[z^n]E(z) \bowtie \left(\dfrac{1 + \sqrt{5}}{2}\right)^n$.

It is notable that this estimate could be established so simply by a purely qualitative
examination of the basic functional equation and of a fixed point of the associated
iteration scheme.

The complete asymptotic analysis of the $E_n$ requires the full power of singularity
analysis methods to be developed in Chapter VI. Equation (59) below states the
end result, which involves fluctuations that are clearly visible on Figure 17. There is
overconvergence of the representation (57), that is, convergence in certain domains
beyond the disc of convergence of E(z). Figure 17 displays the domain of analyticity
of E(z) and reveals its fractal nature.


▷ IV.46. Quadratic convergence. First, for $x \in [0, \frac12]$, one has $\sigma(x) \le \frac32 x^2$, so that
$\sigma^{[j]}(x) \le (3/2)^{2^j - 1} x^{2^j}$. Second, for $x \in [0, A]$, where A is any number $< \rho$, there is a
number $k_A$ such that $\sigma^{[k_A]}(x) < \frac12$, so that $\sigma^{[k]}(x) \le (3/2)\,(3/4)^{2^{k - k_A}}$. Thus, for any
$A < \rho$, the series of iterates of $\sigma$ is quadratically convergent when $z \in [0, A]$. ◁

FIGURE IV.17. Left: the circle of convergence of E(z) and its fractal domain of analyticity
(in gray with darker areas representing slower convergence of iterates of $\sigma$). Right:
the ratio $E_n/(\varphi^n n^{-1})$ plotted against log n for n = 1..500 confirms that $E_n \bowtie \varphi^n$ and
illustrates the periodic fluctuations expressed by Equation (59).


▷ IV.47. The asymptotic number of 2-3 trees. This analysis is from [375, 377]. The number of
2-3 trees satisfies asymptotically

(59)    $E_n = \dfrac{\varphi^n}{n}\,\Omega(\log n) + O\!\left(\dfrac{\varphi^n}{n^2}\right)$,

where $\Omega$ is a periodic function with mean value $(\varphi \log(4 - \varphi))^{-1} \doteq 0.71208$ and period
$\log(4 - \varphi) \doteq 0.86792$. Thus oscillations are inherent in $E_n$. A plot of the ratio $E_n/(\varphi^n/n)$ is
offered in Figure 17. ◁


IV.7.3. Complete asymptotics of a functional equation. George Pólya (1887–1985)
is mostly remembered by combinatorialists for being at the origin of Pólya
theory, a branch of combinatorics that deals with the enumeration of objects invariant
under symmetry groups. However, in his classic article [395, 397] which founded
this theory, Pólya discovered at the same time a number of startling applications of
complex analysis to asymptotic enumeration¹⁴. We detail one of these now.

¹⁴ In many ways, Pólya can be regarded as the grandfather of the field of analytic combinatorics.

The combinatorial problem of interest here is the determination of the number $M_n$
of chemical isomers of alcohols $C_nH_{2n+1}OH$ without asymmetric carbon atoms.
The OGF $M(z) = \sum_n M_n z^n$ that starts as (EIS A000621)

(60)    $M(z) = 1 + z + z^2 + 2z^3 + 3z^4 + 5z^5 + 8z^6 + 14z^7 + 23z^8 + 39z^9 + \cdots$,

is accessible through a functional equation:

(61)    $M(z) = \dfrac{1}{1 - zM(z^2)}$.

Iteration of the functional equation leads to a continued fraction representation,

$M(z) = \cfrac{1}{1 - \cfrac{z}{1 - \cfrac{z^2}{1 - \cfrac{z^4}{\ddots}}}}$,

from which Pólya found:


Proposition IV.7. Let M(z) be the solution analytic around 0 of the functional equation

$M(z) = \dfrac{1}{1 - zM(z^2)}$.

Then, there exist constants K, $\beta$, and B > 1, such that

$M_n = K \cdot \beta^n \left(1 + O(B^{-n})\right)$,   $\beta \doteq 1.68136\,75244$,   $K \doteq 0.36071\,40971$.


PROOF. We offer two proofs. The first one is based on direct consideration of the
functional equation and is of a fair degree of applicability. The second one, following
Pólya, makes explicit a special linear structure present in the problem. As suggested
by the main estimate, the dominant singularity of M(z) is a simple pole.


First proof. By positivity of the functional equation, M(z) dominates coefficientwise
any GF $(1 - zM^{<m}(z^2))^{-1}$, where $M^{<m}(z) := \sum_{0 \le n < m} M_n z^n$ is
the mth truncation of M(z). In particular, one has the domination relation (use
$M^{<2}(z) = 1 + z$)

$M(z) \succeq \dfrac{1}{1 - z - z^3}$.

Since the rational fraction has its dominant pole at $z \doteq 0.68232$, this implies that
the radius $\rho$ of convergence of M(z) satisfies $\rho < 0.69$. In the other direction, since
$M(z^2) < M(z)$ for $z \in (0, \rho)$, then, one has the numerical inequality

$M(z) \le \dfrac{1}{1 - zM(z)}$,   $0 \le z < \rho$.

This can be used to show (Note 48) that the Catalan generating function
$C(z) = (1 - \sqrt{1 - 4z})/(2z)$ is a majorant of M(z) on the interval $(0, \frac14)$, which implies that M(z)
is well defined and analytic for $z \in (0, \frac14)$. In other words, one has $\frac14 \le \rho < 0.69$.
Altogether, the radius of convergence of M lies strictly between 0 and 1.


▷ IV.48. Alcohols, trees, and bootstrapping. Since M(z) starts as $1 + z + z^2 + \cdots$ while
C(z) starts as $1 + z + 2z^2 + \cdots$, there is a small interval $(0, \epsilon)$ such that $M(z) \le C(z)$. By
the functional equation of M(z), one has $M(z) \le C(z)$ for z in the larger interval $(0, \sqrt{\epsilon})$.
Bootstrapping then shows that $M(z) \le C(z)$ for $z \in (0, \frac14)$. ◁


Next, as $z \to \rho^-$, one must have $zM(z^2) \to 1$. (Indeed, if this was not the
case, we would have $zM(z^2) < A < 1$ for some A. But then, since $\rho^2 < \rho$, the
quantity $(1 - zM(z^2))^{-1}$ would be analytic at $z = \rho$, a clear contradiction.) Thus, $\rho$
is determined implicitly by the equation

$\rho M(\rho^2) = 1$,   $0 < \rho < 1$.

One can estimate $\rho$ numerically (Note 49), and the statement follows with $\beta = 1/\rho$.
(Pólya determined $\rho$ to five decimals by hand!)

The previous discussion also implies that $\rho$ is a pole of M(z), which must be
simple (since $\partial_z\bigl(zM(z^2)\bigr)\big|_{z=\rho} > 0$). Thus

(62)    $M(z) \underset{z \to \rho}{\sim} K\,\dfrac{1}{1 - z/\rho}$,   $K := \dfrac{1}{\rho M(\rho^2) + 2\rho^3 M'(\rho^2)}$.

The argument shows at the same time that M(z) is meromorphic in $|z| < \sqrt{\rho} \doteq 0.77$.
That $\rho$ is the only pole of M(z) on $|z| = \rho$ results from the fact that $zM(z^2) = z + z^3 + \cdots$
can be subjected to the type of argument encountered in the context of the
Daffodil Lemma (see the discussion of quasi-inverses in the proof of Proposition IV.3,
p. 253). The translation of the singular expansion (62) then yields the statement.

▷ IV.49. The growth constant of molecules. The quantity $\rho$ can be obtained as the limit of
the $\rho_m$ satisfying $\sum_{n=0}^{m} M_n \rho_m^{2n+1} = 1$, together with $\rho \in [\frac14, 0.69]$. In each case, only a
few of the $M_n$ (provided by the functional equation) are needed. One obtains: $\rho_{10} \doteq 0.595$,
$\rho_{20} \doteq 0.594756$, $\rho_{30} \doteq 0.59475397$, $\rho_{40} \doteq 0.594753964$. This algorithm constitutes a
geometrically convergent scheme with limit $\rho \doteq 0.59475\,39639$. ◁
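
The scheme of Note 49 can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the authors' program: the coefficients $M_n$ are obtained from the functional equation written as $M = 1 + zM(z^2)M$, each $\rho_m$ is found by bisection on the interval [1/4, 0.69], and K is evaluated from the formula in (62) using truncated series. The names `alcohol_counts` and `rho_m` and the truncation order 80 are arbitrary choices.

```python
def alcohol_counts(n_max):
    """Coefficients M_0..M_n_max of M(z) = 1/(1 - z*M(z^2))."""
    M = [1]
    for n in range(1, n_max + 1):
        # From M = 1 + z*M(z^2)*M:  M_n = sum over j of M_j * M_{n-2j-1}.
        M.append(sum(M[j] * M[n - 2 * j - 1] for j in range((n - 1) // 2 + 1)))
    return M

def rho_m(M, m, lo=0.25, hi=0.69, tol=1e-13):
    """Root in [lo, hi] of sum_{n=0}^{m} M_n x^(2n+1) = 1 (increasing in x)."""
    f = lambda x: sum(M[n] * x ** (2 * n + 1) for n in range(m + 1)) - 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

M = alcohol_counts(80)
print(M[:10])                       # [1, 1, 1, 2, 3, 5, 8, 14, 23, 39]
for m in (10, 20, 30, 40):
    print(m, rho_m(M, m))

rho = rho_m(M, 80)
beta = 1 / rho

# K from (62), with M(rho^2) and M'(rho^2) evaluated by truncated series.
M_val = sum(c * rho ** (2 * n) for n, c in enumerate(M))
dM_val = sum(n * c * rho ** (2 * (n - 1)) for n, c in enumerate(M) if n > 0)
K = 1 / (rho * M_val + 2 * rho ** 3 * dM_val)
print("rho =", rho, " beta =", beta, " K =", K)
```

The printed values can be compared with the constants $\beta$ and K of Proposition IV.7 and with the approximations $\rho_{10}, \ldots, \rho_{40}$ listed in the note.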


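Second proof. First, a sequence of formal approximants follows from (61) starting
with

$1$,   $\dfrac{1}{1 - z}$,   $\cfrac{1}{1 - \cfrac{z}{1 - z^2}} = \dfrac{1 - z^2}{1 - z - z^2}$,   $\cfrac{1}{1 - \cfrac{z}{1 - \cfrac{z^2}{1 - z^4}}} = \dfrac{1 - z^2 - z^4}{1 - z - z^2 - z^4 + z^5}$,

which permits us to compute any number of terms of the series M(z). Closer examination
of (61) suggests to set

$M(z) = \dfrac{\psi(z^2)}{\psi(z)}$,

where $\psi(z) = 1 - z - z^2 - z^4 + z^5 - z^8 + z^9 + z^{10} - z^{16} + \cdots$. Back substitution
into (61) yields

$\dfrac{\psi(z^2)}{\psi(z)} = \dfrac{1}{1 - z\,\dfrac{\psi(z^4)}{\psi(z^2)}}$   or   $\dfrac{\psi(z^2)}{\psi(z)} = \dfrac{\psi(z^2)}{\psi(z^2) - z\psi(z^4)}$,

which shows $\psi(z)$ to be a solution of the functional equation

$\psi(z) = \psi(z^2) - z\psi(z^4)$,   $\psi(0) = 1$.

The coefficients of $\psi$ satisfy the recurrence

$\psi_{4n} = \psi_{2n}$,   $\psi_{4n+1} = -\psi_n$,   $\psi_{4n+2} = \psi_{2n+1}$,   $\psi_{4n+3} = 0$,

which implies that their values are all contained in the set $\{0, -1, +1\}$.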
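
The recurrence for the $\psi_n$ lends itself to direct computation. The Python sketch below is an illustration, not part of the original proof: it generates the coefficients, confirms that they lie in {0, -1, +1}, locates the zero of $\psi$ in the interval [0.5, 0.69] by bisection (the bracket is an assumption, justified by the bounds on $\rho$ from the first proof), and evaluates $-\psi(\rho^2)/(\rho\,\psi'(\rho))$, the residue-based constant attached to the simple pole of $\psi(z^2)/\psi(z)$. The function names and the truncation degree 512 are arbitrary.

```python
def psi_coefficients(n_max):
    """psi_0..psi_n_max from psi(z) = psi(z^2) - z*psi(z^4), psi(0) = 1."""
    psi = [1]
    for m in range(1, n_max + 1):
        r = m % 4
        if r == 0 or r == 2:
            psi.append(psi[m // 2])     # psi_{4n} = psi_{2n}, psi_{4n+2} = psi_{2n+1}
        elif r == 1:
            psi.append(-psi[m // 4])    # psi_{4n+1} = -psi_n
        else:
            psi.append(0)               # psi_{4n+3} = 0
    return psi

def evaluate(coeffs, z):
    """Horner evaluation of the polynomial with the given coefficients."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

def zero_in_bracket(coeffs, lo=0.5, hi=0.69, tol=1e-13):
    """Bisection, assuming the polynomial is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

psi = psi_coefficients(512)                        # truncation of the series psi(z)
dpsi = [n * c for n, c in enumerate(psi)][1:]      # coefficients of psi'(z)

print(psi[:17])
rho = zero_in_bracket(psi)
K = -evaluate(psi, rho ** 2) / (rho * evaluate(dpsi, rho))
print("rho =", rho, " 1/rho =", 1 / rho, " K =", K)
```

Since the coefficients are bounded by 1 in absolute value, truncating at degree 512 changes the value of $\psi$ on the bracket by a negligible amount.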

Thus, M(z) appears to be the quotient of two functions, $\psi(z^2)/\psi(z)$, each analytic
in the unit disc, and M(z) is meromorphic in the unit disc. A numerical evaluation
then shows that $\psi(z)$ has its smallest positive real zero at $\rho \doteq 0.59475$, which is a
simple root. The quantity $\rho$ is thus a pole of M(z) (since, numerically, $\psi(\rho^2) \ne 0$).
Thus

$M(z) \sim \dfrac{\psi(\rho^2)}{(z - \rho)\,\psi'(\rho)}$   $\Longrightarrow$   $M_n \sim -\dfrac{\psi(\rho^2)}{\rho\,\psi'(\rho)} \left(\dfrac{1}{\rho}\right)^n$.

Numerical computations then yield Pólya’s estimate. Et voilà!


The example of Pólya’s alcohols is exemplary, both from a historical point of
view and from a methodological perspective. As the first proof of Proposition IV.7
demonstrates, quite a lot of information can be pulled out of a functional equation
without solving it. (A similar situation will be encountered in relation to coin fountains,
Example V.7, p. 307.) Here, we have made great use of the fact that if f(z) is
analytic in |z| < r and some a priori bounds imply the strict inequalities 0 < r < 1,
then one can regard functions like $f(z^2)$, $f(z^3)$, and so on, as “known” since they are
analytic in the disc of convergence of f and even beyond, a situation also evocative
of our earlier discussion of Pólya operators in Subsection IV.4. Globally, the lesson
is that functional equations, even complicated ones, can be used to bootstrap the local
singular behaviour of solutions, and one can often do so even in the absence of any
explicit generating function solution. The transition from singularities to coefficient
asymptotics is then a simple jump.

▷ IV.50. An arithmetic exercise. The coefficients $\psi_n = [z^n]\psi(z)$ can be characterized simply
in terms of the binary representation of n. Find the asymptotic proportion of the $\psi_n$ for
$n \in [1\,..\,2^N]$ that assume each of the values 0, +1, and -1. ◁


IV.8. Perspective 


In this chapter, we have started examining generating functions under a new light. 
Instead of being merely formal algebraic objects—power series—that encode ex- 
actly counting sequences, generating functions can be regarded as analytic objects— 
transformations of the complex plane—whose singularities provide a wealth of infor- 
mation concerning asymptotic properties of structures. 

Singularities provide a royal road to coefficient asymptotics. We could treat here,
with a relatively simple apparatus, singularities that are poles. In this perspective,
the two main statements of this chapter are the theorems relative to the expansion of
rational and meromorphic functions (Theorems IV.9 and IV.10). These are classical
results of analysis. Issai Schur (1875–1941) is to be counted amongst the very first
mathematicians who recognized their rôle in combinatorial enumerations (denumerants,
Example 5, p. 244). The complex-analytic thread was developed much further by
George Pólya in his famous paper of 1937 (see [395, 397]), which Read in [397, p. 96]
describes as a “landmark in the history of combinatorial analysis”. There, Pólya laid
the groundwork of combinatorial chemistry, the enumeration of objects under group
actions, as well as the complex-asymptotic theory of graphs and trees.

The present chapter serves as the foundation stone of a rich theory to be devel- 
oped in future chapters. In particular the method of singularity analysis exposed in 
Chapter VI considerably extends the range of applicability of the Second Principle to 
functions having singularities appreciably more complicated than poles (e.g., the ones
involving fractional powers, logarithms, iterated logarithms, and so on).

Basics. The theory of analytic functions benefits from the equivalence between two
notions, analyticity and differentiability. It is the basis of a powerful integral calculus, much
different from its real variable counterpart. The following two results can serve as “axioms” of
the theory.


THEOREM IV.1 [Basic Equivalence Theorem] (p. 219): Two fundamental notions are equiv- 
alent, namely, analyticity (defined by convergent power series) and holomorphy (defined by 
differentiability). Combinatorial generating functions, a priori determined by their expansions 
at 0 thus satisfy the rich set of properties associated with these two equivalent notions. 
THEOREM IV.2 [Null Integral Property] (p. 221): The integral of an analytic function along a 
simple loop (closed path that can be contracted to a single point) is 0. Consequently, integrals 
are largely independent of particular details of the integration contour. 

Residues. For meromorphic functions (functions with poles), residues are essential. Co- 
efficients of a function can be evaluated by means of integrals. The following two theorems 
provide connections between local properties of a function (e.g., coefficients at one point) and 
global properties of the function elsewhere (e.g., an integral along a distant curve). 


THEOREM IV.3 [Cauchy’s residue theorem] (p. 222): In the realm of meromorphic functions, 
integrals of a function can be evaluated based on local properties of the function at a few specific 
points, its poles. 
THEOREM IV.4 [Cauchy’s Coefficient Formula] (p. 224): This is an almost immediate conse- 
quence of Cauchy’s residue theorem: The coefficients of an analytic function admit of a repre- 
sentation by a contour integral. Coefficients can then be evaluated or estimated using properties 
of the function at points away from the origin. 

Singularities and growth. Singularities (places where analyticity stops), provide essential 
information on the growth rate of a function’s coefficients. The “First Principle” relates the 
exponential growth rate of coefficients to the location of singularities. 


THEOREM IV.5 [Boundary singularities] (p. 227): A function (given by its series expansion 
at 0) always has a singularity on the boundary of its disc of convergence. 
THEOREM IV.6 [Pringsheim’s Theorem] (p. 229): This theorem refines the previous one for 
functions with non-negative coefficients. It implies that, in the case of combinatorial generating 
functions, the search for a dominant singularity can be restricted to the positive real axis. 
THEOREM IV.7 [Exponential Growth Formula] (p. 231): The exponential growth rate of
coefficients is dictated by the location of the singularities nearest to the origin (the dominant
singularities).
THEOREM IV.8 [Computability of growth] (p. 237): For any combinatorial class that is
nonrecursive (iterative), the exponential growth rate of coefficients is invariably a computable number.
This statement can be regarded as the first general theorem of analytic combinatorics.

Coefficient asymptotics. The “Second Principle” relates subexponential factors of
coefficients to the nature of singularities. For rational and meromorphic functions, everything is
simple.


THEOREM IV.9 [Expansion of rational functions] (p. 243): Coefficients of rational functions 
are explicitly expressible in terms of the poles, given their location (values) and nature (multi- 
plicity). 

THEOREM IV.10 [Expansion of meromorphic functions] (p. 245): Coefficients of meromorphic
functions admit of a precise asymptotic form with exponentially small error terms, given the 
location and nature of the dominant poles. 


FIGURE IV.18. A summary of the main results of Chapter IV.

As we hope to convince our reader, a consequence of the theory developed in
Part B is that most combinatorial classes amenable to symbolic descriptions can be 
thoroughly analysed, as regards their asymptotic properties, by means of a selected 
collection of basic theorems of complex analysis. The case of structures like balanced 
trees and molecules, where only a functional equation of sorts is available, is exem- 
plary. 

This chapter has been designed to serve as a refresher of basic complex analysis, with 
special emphasis on methods relevant for analytic combinatorics. See Figure 18 for a concise 
summary of results. References most useful for the discussion given here include the books of 
Titchmarsh [469] (oriented towards classical analysis), Whittaker and Watson [492] (stressing 
special functions), Dieudonné [129], Hille [269], and Knopp [299]. Henrici [265] presents com- 
plex analysis under the perspective of constructive and numerical methods, a highly valuable 
point of view for this book. 

De Bruijn’s classic booklet [111] is a wonderfully concrete introduction to effective as- 
ymptotic theory, and it contains many examples from discrete mathematics thoroughly worked 
out using a complex-analytic approach. The use of such analytic methods in combinatorics 
was pioneered in modern times by Bender and Odlyzko, whose first publications in this area 
go back to the 1970’s. The state of affairs in 1995 regarding analytic methods in combinatorial 
enumeration is superbly summarized in Odlyzko’s scholarly chapter [377]. Wilf devotes his 
Chapter 5 of Generatingfunctionology [496] to this question. The books by Hofri [270] and
Szpankowski [458] contain useful accounts in the perspective of analysis of algorithms. See also
our book [434] for a light introduction and the chapter by Vitter and Flajolet [486] for more on 
this specific topic. 


V 


Applications of Rational and 
Meromorphic Asymptotics 


Analytic methods are extremely powerful and when they apply, 
they often yield estimates of unparalleled precision. 


– ANDREW ODLYZKO [377]

The primary goal of this chapter is to provide combinatorial illustrations of the power
of complex analytic methods, and specifically of the rational-meromorphic frame- 
work exposed in the previous chapter. At the same time, we shift gears and envisage 
counting problems at a new level of generality. Precisely, we organize combinatorial 
problems into wide families of combinatorial types amenable to a common treatment 
and associated with a common collection of asymptotic properties. Without attempt- 
ing a formal definition, we call schema any such family determined by combinatorial 
and analytic conditions that covers an infinity of combinatorial classes. 

The first schema comprises regular specifications and languages, which a priori 
lead to rational generating functions and thus systematically resort to Theorem IV.9 
(p. 243), to the effect that coefficients are described as exponential-polynomials. In the 
case of regular specifications, much additional structure is present, especially positiv- 
ity. As a consequence, fluctuations can be systematically circumvented. Applications 
include the analysis of longest runs, corresponding to maximal sequences of good 
(or bad) luck in games of chance, pure birth processes, and the occurrence of hidden 
patterns (subsequences) in random texts. 

We then consider an important subset of regular specifications, the ones that are
built on nested sequences and combinatorially correspond to a variety of lattice paths.
Such nested sequences naturally lead to nested quasi-inverses, which are none other
than continued fractions. A wealth of combinatorial, algebraic, and analytic properties
then surround such constructions. A prime illustration is the very explicit analysis of
height in Dyck paths and general Catalan trees; other interesting applications relate to
coin fountains and interconnection networks.

Next, we discuss a general schema of analytic combinatorics known as the supercritical
sequence schema, which provides a neat illustration of the power of meromorphic
asymptotics while being of a very wide applicability. For instance, one can
predict very precisely (and easily) the number of ways in which an integer can be
decomposed additively as a sum of primes (or twin primes), this even though many
details of the distribution of primes are still shrouded in mystery.

Finally, the last two sections examine positive linear systems of generating functions,
starting with the simplest case of graphs and automata and concluding with the
general framework of transfer matrices. Although the resulting generating functions
are once more bound to be rational, there is benefit in examining them as defined
implicitly (rather than solving explicitly) and working out singularities directly. The
spectrum of matrices (the set of eigenvalues) then plays a central rôle. Our treatment is
then close to the Perron–Frobenius theory of nonnegative matrices, whose importance
has been long recognized in the theory of finite Markov chains. A general discussion
of singularities can then be conducted, leading to valuable consequences on a
variety of models: paths in graphs, finite automata, and transfer matrices. The last
example discussed in this chapter treats locally constrained permutations, where
rational functions combined with inclusion-exclusion provide an entry to the world of
value-constrained permutations.


In the various combinatorial examples encountered in this chapter, the generating 
functions are generally meromorphic in some domain extending beyond their disc of 
convergence at 0. As a consequence, the asymptotic estimates of coefficients involve 
main terms that are explicit exponential polynomials and error terms that are exponentially
smaller. This is a situation which is well summarized by Odlyzko’s aphorism:
“Analytic methods [...] often yield estimates of unparalleled precision”.


V.1. A roadmap to rational and meromorphic asymptotics 


The key character in this chapter is the combinatorial sequence construction SEQ.
Since its translation into generating functions involves a quasi-inverse, $(1 - f)^{-1}$, the
construction should in many cases be expected to induce polar singularities. Also,
linear systems of equations, of which the simplest case is X = 1 + AX, are solvable
by means of inverses: the solution is $X = (1 - A)^{-1}$ in the scalar case, and it is otherwise
expressible as a quotient of determinants, by Cramer’s rule, in the vectorial case.
Consequently, linear systems of equations are also conducive to polar singularities.

This chapter accordingly develops along two main lines. First, we study non- 
recursive families of combinatorial problems that are, in a suitable sense, driven by 
a sequence construction. Second, we examine families of recursive problems that 
are naturally described by linear systems of equations. Clearly, the general theorems 
giving the asymptotic forms of coefficients of rational and meromorphic functions 
apply. As we see here, the additional positivity structure arising from combinatorics 
often entails notable simplifications in the asymptotic form of counting sequences. 


Regular specification and languages. This topic is treated in Section V.2. Regular
specifications are non-recursive specifications that only involve the constructions
(+, ×, SEQ). In the unlabelled case, they can always be interpreted as describing a
regular language in the sense of Chapter I. The main result here is the following:
given a regular specification $\mathcal{R}$, it is possible to determine constructively a number D,
so that an asymptotic estimate of the form

(1)    $R_n = P(n)\beta^n + O(B^n)$,   $0 \le B < \beta$,   P a polynomial,

holds, once the index n is restricted to a fixed congruence class modulo D. (Naturally,
the quantities P, $\beta$, B may depend on the particular congruence class considered.) In
other words, a “pure” exponential polynomial form holds for each of the D sections
of the counting sequence $(R_n)_{n \ge 0}$. In particular, irregular fluctuations, which might
otherwise arise from the existence of several dominant poles sharing the same modulus
but having incommensurable arguments (see the discussion in Subsection IV.6.1,
dedicated to multiple singularities), are simply not present in regular specifications
and languages. Similar estimates hold for profiles of regular specifications,
where the profile of an object is understood as the number of times any fixed construction
is employed.

Nested sequences, lattice paths, and continued fractions. What is considered
here could be termed the SEQ ∘ ··· ∘ SEQ schema, corresponding to nested sequences.
The resulting GFs are chains of quasi-inverses, that is, continued fractions. Though
the general theory of regular specifications applies, the additional structure resulting
from nested sequences implies in essence uniqueness and simplicity of the dominant
pole, resulting directly in an estimate of the form

(2)    $S_n = c\beta^n + O(B^n)$,   $0 \le B < \beta$,   $c \in \mathbb{R}_{>0}$,

for objects enumerated by nested sequences. This schema covers lattice paths of
bounded height, their weighted versions, as well as several other bijectively equivalent
classes, like interconnection networks. In each case, profiles can be fully characterized,
the estimates being of a simple form.


The supercritical sequence. This is a schema of the general form $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$
with a simple analytic condition, “supercriticality”, attached to the generating function
G(z) of $\mathcal{G}$. Under this condition, the sequence $(F_n)$ happens to be predictable
and an asymptotic estimate,

(3)    $F_n = cS^n + O(T^n)$,   $0 \le T < S$,   $c \in \mathbb{R}_{>0}$,

applies with S such that G(1/S) = 1. Integer compositions, surjections, and alignments
presented in Chapters I and II can then be treated in a unified manner. The
supercritical sequence schema even covers many situations where $\mathcal{G}$ is not necessarily
constructible; this includes compositions into summands that are prime numbers or
twin primes. Parameters, like the number of components and more generally profiles,
are under these circumstances governed by laws that hold with a high probability.
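
As an informal illustration of the condition G(1/S) = 1, the Python sketch below (not part of the original text; all function names are hypothetical) treats compositions into prime summands. It solves G(x) = 1 by bisection for the truncated series $G(x) = \sum_p x^p$, sets S = 1/x, and compares S with the ratio of consecutive counts obtained from the recurrence $F_n = \sum_p F_{n-p}$. The truncation of the primes at 200 and the search interval (0, 0.95) are assumptions made for the sake of the example.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def growth_rate(summands, lo=0.0, hi=0.95, tol=1e-13):
    """Return S = 1/x where G(x) = sum of x^s over the summands equals 1.

    Assumes G(hi) > 1, so that the increasing function G crosses 1 in (lo, hi).
    """
    G = lambda x: sum(x ** s for s in summands)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if G(mid) < 1:
            lo = mid
        else:
            hi = mid
    return 2 / (lo + hi)

def composition_counts(summands, n_max):
    """F_0..F_n_max: number of compositions of n into the allowed summands."""
    F = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        F[n] = sum(F[n - s] for s in summands if s <= n)
    return F

primes = primes_up_to(200)
S = growth_rate(primes)
F = composition_counts(primes, 200)
print("S =", S)
print("F_200 / F_199 =", F[200] / F[199])
print("all positive summands: S =", growth_rate(range(1, 200)))   # close to 2
```

For unrestricted compositions, G(z) = z/(1 - z) and the same routine returns a value close to 2, in agreement with the count $2^{n-1}$ of compositions of n.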


Paths in graphs and automata. The framework of paths in directed graphs is of 
considerable generality. In particular, it covers the case of finite automata introduced 
in Chapter I. Although, in the abstract, the descriptive power of this framework is 
formally equivalent to the one of regular specifications (APPENDIX A: Regular lan- 
guages, p. 678), there is great advantage in considering directly problems whose natu- 
ral formulation is recursive and phrased in terms of graphs or automata. (The reduction 
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of automata to regular expressions is nontrivial so that it does not tend to preserve the 
original combinatorial structure.) The algebraic theory is that of matrices of the form 
(I — zT)~+, where T is a matrix with nonnegative entries. The analytic theory be- 
hind the scene is now that of positive matrices and the companion Perron-Frobenius 
theory. Uniqueness and simplicity of dominant poles of generating functions can be 
guaranteed under easily testable structural conditions—principally, the condition of 
irreducibility that corresponds to a strong connectedness of the system. Then a pure 
exponential polynomial form of the simplest type holds, 


(4) $C_n \sim c\,\lambda_1^n + O(\Lambda^n), \qquad 0 < \Lambda < \lambda_1, \qquad c \in \mathbb{R}_{>0},$


where $\lambda_1$ is the (unique) dominant eigenvalue of the transition matrix T. Applications
include walks over various types of graphs (the interval graph, the devil’s staircase) 
and words excluding one or several patterns (walks on the De Bruijn graph). 
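
The following minimal sketch illustrates (4) on a standard toy case chosen here for illustration (it is not an example worked in the text): binary words that avoid the factor bb, seen as walks in a two-state graph. The matrix `T` and the helper names are assumptions of the sketch. The counts come from powers of T, and their successive ratios approach the dominant eigenvalue, which for this matrix is the golden ratio.

```python
from itertools import product
from math import sqrt

# Transfer matrix of an automaton for binary words with no factor "bb":
# state 0 = the last letter is not b (or the word is empty),
# state 1 = the last letter is b.
T = [[1, 1],   # from state 0: letter a -> state 0, letter b -> state 1
     [1, 0]]   # from state 1: letter a -> state 0, letter b is forbidden


def count_walks(T, start, n):
    """Number of walks of length n that start at `start`, i.e. (T^n 1)[start]."""
    v = [1] * len(T)
    for _ in range(n):
        v = [sum(T[i][j] * v[j] for j in range(len(T))) for i in range(len(T))]
    return v[start]


def brute_force(n):
    """Direct enumeration of the words of length n avoiding 'bb'."""
    return sum("bb" not in "".join(w) for w in product("ab", repeat=n))


if __name__ == "__main__":
    print([count_walks(T, 0, n) for n in range(10)])
    print(count_walks(T, 0, 41) / count_walks(T, 0, 40), (1 + sqrt(5)) / 2)
```

Irreducibility holds here because each state can reach the other, so a single exponential dominates, in line with the statement preceding (4).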


Transfer matrices. This framework, whose origins lie in statistical physics, is an 
extension of automata and paths in graphs. What is retained is the notion of a finite 
state system, but transitions can now take place at different speeds. Algebraically, 
one is dealing with matrices of the form $(I - T(z))^{-1}$, where T(z) is a matrix whose
entries are polynomials (in z) with nonnegative coefficients. Perron-Frobenius theory
can be adapted to cover such cases, which, to a probabilist, involve a mixture of Markov
chain and renewal theory. The consequence is once more an estimate of the type (4) 
for this category of models. A striking application of transfer matrices is a study, 
with an experimental mathematics flavour, of self-avoiding walks and polygons in the 
plane: it turns out to be possible to predict, with a high degree of confidence (but 
no mathematical certainty), what the number of polygons is and which distribution 
of area is to be expected. A combination of the transfer matrix approach with a suitable
use of inclusion-exclusion finally provides (Subsection V.6.4) a solution to the
classic ménage problem of combinatorial theory as well as to many related questions 
regarding value-constrained permutations. 


Sections V.2 to V.6 are organized following a common pattern: first, we discuss 
“combinatorial aspects”, then “analytic aspects”, and finally “applications”. Each of 
Sections V.2 to V.5 is furthermore centred around two analytic-combinatorial theo- 
rems, one describing asymptotic enumeration, the other quantifying the asymptotic 
profiles of combinatorial structures. The last section (Section V.6) departs slightly
from this general pattern: transfer matrices are reducible rather simply to the frame- 
work of paths in graphs and automata, presented in the immediately preceding section, 
so that, in order to avoid redundancy, the corresponding theorems are not explicitly 
stated. 


V.2. Regular specifications and languages


The purpose of this section is the general study of the (+, ×, SEQ) schema, which
covers all regular specifications. As we show here, pure exponential-polynomial forms 
with a single dominating exponential can always be extracted. Theorems V.1 and V.2 
thus provide a universal framework for the asymptotic analysis of regular classes. 
Additional structural conditions to be introduced in later sections (nested sequences, 
irreducibility of the dependency graph and of transfer matrices) will then be seen to
induce further simplifications in asymptotic formulae.


V.2.1. Combinatorial aspects. For convenience and without loss of analytic 
generality, we consider here unlabelled structures. According to Chapter I, a com- 
binatorial specification is regular if it is nonrecursive (“iterative”) and it involves only 
the constructions of Atom, Union, Product, and Sequence. A language C is S-regular 
if it is combinatorially isomorphic to a class M described by a regular specification. 
Alternatively, a language is S-regular if all the operations involved in its descrip- 
tion (unions, catenation products and star operations) are unambiguous. See Defini- 
tion I.10 (p. 48) and the companion Proposition I.2 (p. 48). 

The dictionary translating constructions into OGFs is 


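(5) $\mathcal{F}+\mathcal{G} \mapsto F+G, \qquad \mathcal{F}\times\mathcal{G} \mapsto F\times G, \qquad \mathrm{SEQ}(\mathcal{F}) \mapsto (1-F)^{-1},$
and for languages, under the essential condition of non-ambiguity,
(6) $\mathcal{L}\cup\mathcal{M} \mapsto L+M, \qquad \mathcal{L}\cdot\mathcal{M} \mapsto L\times M, \qquad \mathcal{L}^{\star} \mapsto (1-L)^{-1}.$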


The rules (5) and (6) then give rise to generating functions that are invariably rational 
functions. Consequently, given a regular class C, the exponential-polynomial form of 
coefficients expressed by Theorem IV.9 systematically applies, and one has 
(7) $C_n \equiv [z^n]C(z) = \sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n},$

for a family of algebraic numbers $\alpha_j$ (the poles of C(z)) and a family of polynomials $\Pi_j$.

As we know from the discussion of periodicities in Section IV.6.1 (p. 250), the
collective behaviour of the sum in (7) depends on whether or not a single $\alpha$ dominates
all others in modulus. In the case where several dominant singularities coexist, fluctuations
of sorts (either periodic or irregular) may manifest themselves. In contrast, if
a single $\alpha$ dominates, then the exponential-polynomial formula acquires a transparent
asymptotic meaning. Accordingly, we set:


Definition V.1. An exponential-polynomial form $\sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n}$ is said to be pure
if $|\alpha_1| < |\alpha_j|$ for all $j \ge 2$. In that case, a single exponential dominates all the other
ones.


As we see next for regular languages and specifications, the corresponding count- 
ing coefficients can always be described by a finite collection of pure exponential 
polynomial forms. The fundamental reason is that we are dealing with a special sub- 
set of rational functions, one that enjoys strong positivity properties. 
▷ V.1. Positive rational functions. Define the class $\mathrm{Rat}^+$ of positive rational functions as
the smallest class containing polynomials with positive coefficients ($\mathbb{R}_{\ge 0}[z]$) and closed under
sum, product, and quasi-inverse, where $Q(f) = (1 - f)^{-1}$ is applied to elements f such that
f(0) = 0. The OGF of any regular class with positive weights attached to neutral structures
and atoms is in $\mathrm{Rat}^+$. Conversely, any function in $\mathrm{Rat}^+$ is the OGF of a positively weighted
regular class. The notion of a $\mathrm{Rat}^+$ function is for instance useful in the analysis of weighted
word models and Bernoulli trials, as discussed in Section III.6.1, p. 178. ◁

V.2.2. Analytic aspects. First we need the notion of sections of a sequence.


Definition V.2. Let $(f_n)$ be a sequence of numbers. Its section of parameters D, r,
where $D \in \mathbb{Z}_{>0}$ and $r \in \mathbb{Z}_{\ge 0}$, is the subsequence $(f_{nD+r})$. The numbers D and r are
referred to as the modulus and the base respectively.


The main theorem describing the asymptotic behaviour of regular classes is a 
consequence of Proposition IV.3 (p. 253) and is originally due to Berstel. (See Soit- 
tola’s article [442] as well as the books by Eilenberg [149, Ch VII] and Berstel— 
Reutenauer [44] for context and proofs of some of the assertions below.) 


Theorem V.1 (Asymptotics of regular classes). Let S be a class described by a regular
specification. Then there exists an integer D such that each section of modulus D of
$S_n$ that is not eventually 0 admits a pure exponential-polynomial form: for n larger
than some $n_0$, and any such section of base r, one has


$S_n = \Pi(n)\,\beta^n + \sum_{j=1}^{m} P_j(n)\,\beta_j^n, \qquad n \equiv r \bmod D,$


where $\beta > |\beta_j|$, and $\Pi$, $P_j$ are polynomials that depend on the base r, with $\Pi(x) \not\equiv 0$.


PROOF. Let $\alpha_1$ be the dominant pole of S(z) that is positive. Proposition IV.3 asserts
that any dominant pole $\alpha$ is such that $\alpha/|\alpha|$ is a root of unity. Let $D_0$ be
such that the dominant singularities are all contained in the set $\{\alpha_1\omega^{j}\}_{j=0}^{D_0-1}$, where
$\omega = \exp(2i\pi/D_0)$. By collecting all contributions arising from dominant poles in
the general expansion (7) and by restricting n to a fixed congruence class modulo $D_0$,
namely $n = r + D_0\nu$ with $0 \le r < D_0$, one gets

(8) $S_{r+D_0\nu} = \Pi^{[r]}(n)\,\alpha_1^{-D_0\nu} + O(A^{-n}).$

There $\Pi^{[r]}$ is a polynomial depending on r and the remainder term represents an exponential
polynomial with growth at most $O(A^{-n})$ for some $A > \alpha_1$.

The sections with modulus $D_0$ that are not eventually 0 can be categorized into
two classes.


– Let $R_{\ne 0}$ be the set of those values of r such that $\Pi^{[r]}$ is not identically 0.
The set $R_{\ne 0}$ is nonempty (else the radius of convergence of S(z) would be
larger than $\alpha_1$). For any base $r \in R_{\ne 0}$, the assertion of the theorem is then
established with $\beta = 1/\alpha_1$.

– Let $R_0$ be the set of those values of r such that $\Pi^{[r]}(x) \equiv 0$, with $\Pi^{[r]}$ as
given by (8). Then one needs to examine the next layer of poles of S(z), as
detailed below.


Consider a number r such that $r \in R_0$, so that the polynomial $\Pi^{[r]}$ is identically 0.
First, we isolate in the expansion of S(z) those indices that are congruent to r modulo
$D_0$. This is achieved by means of a Hadamard product:


$g(z) = S(z) \odot \left(\frac{z^{r}}{1 - z^{D_0}}\right).$


A classical theorem [47, 149] from the theory of positive rational functions in the
sense of Note 1 asserts that such functions are closed under Hadamard product. (A
dedicated construction is also possible.) Then the resulting function g(z) is of the
form

$g(z) = z^{r}\,\gamma(z^{D_0}),$

with the rational function $\gamma(z)$ being analytic at 0. Note that we have $[z^{\nu}]\gamma(z) = S_{\nu D_0 + r}$,
so that $\gamma$ is exactly the generating function of the section of base r of S(z).
One verifies next that $\gamma(z)$, which is obtained by the substitution $z \mapsto z^{1/D_0}$ in
$g(z)z^{-r}$, is itself a positive rational function. Then, by a fresh application of Berstel’s
Theorem (Proposition IV.3, p. 253), this function, if not a polynomial, has a radius of
convergence $\rho$ with all its dominant poles $\sigma$ being such that $\sigma/\rho$ is a $D_1$-th root of unity
for some $D_1 \ge 1$. The argument originally applied to S(z) can thus be repeated, with
$\gamma(z)$ replacing S(z). In particular, one finds at least one section (of modulus $D_1$) of
the coefficients of $\gamma(z)$ that admits a pure exponential-polynomial form. The other
sections of modulus $D_1$ can themselves be further refined, and so on.

In other words, successive refinements of the sectioning process provide at each
stage at least one pure exponential-polynomial form, possibly leaving a few congruence
classes open for further refinements. Define the layer index of a rational function
f as the integer $\kappa(f)$, such that


$\kappa(f) = \operatorname{card}\{\, |\zeta| \;:\; f(\zeta) = \infty \,\}.$


(This index is thus the number of different moduli of poles of f.) It is seen that each
successive refinement step decreases by at least 1 the layer index of the rational function
involved, thereby ensuring termination of the whole refinement process. Finally,
the collection of the iterated sectionings obtained can be reduced to a single sectioning
according to a common modulus D, which is the least common multiple of the
collection of all the finite products $D_0 D_1 \cdots$ that are generated by the algorithm.


For instance the coefficients (Figure 1) of the function

(9) $F(z) = \frac{1}{(1-z)(1-z^2-z^4)} + \frac{z}{1-3z^3},$

associated to the regular language $a^\star(bb + cccc)^\star + d(ddd + eee + fff)^\star$, exhibit an
apparently irregular behaviour, with the expansion of F(z) starting as


$1 + 2z + 2z^2 + 2z^3 + 7z^4 + 4z^5 + 7z^6 + 16z^7 + 12z^8 + 12z^9 + 47z^{10} + 20z^{11} + \cdots.$


However the sections modulo 6 each admit a pure exponential-polynomial form and 
consequently become easy to describe. 
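
The sectioning phenomenon can be checked numerically. The sketch below, in which the function name `coefficients` and the choice of range are illustrative assumptions, computes the coefficients of F(z) in (9) exactly and compares growth ratios along sections of modulus 6. In sections of base r ≡ 1 (mod 3) the term z/(1 − 3z^3) wins and the ratio F_{n+6}/F_n tends to 9. In the other sections the two real poles of 1/(1 − z^2 − z^4) win and the ratio tends to the cube of the golden ratio.

```python
from math import sqrt


def coefficients(n_max):
    """Coefficients F_0..F_{n_max} of F(z) = 1/((1-z)(1-z^2-z^4)) + z/(1-3z^3)."""
    # A[n] = [z^n] 1/(1 - z^2 - z^4), from the recurrence A_n = A_{n-2} + A_{n-4}
    A = [0] * (n_max + 1)
    for n in range(n_max + 1):
        A[n] = (1 if n == 0 else 0)
        if n >= 2:
            A[n] += A[n - 2]
        if n >= 4:
            A[n] += A[n - 4]
    F, partial = [], 0
    for n in range(n_max + 1):
        partial += A[n]                       # the factor 1/(1-z) takes partial sums
        second = 3 ** ((n - 1) // 3) if n % 3 == 1 else 0   # [z^n] z/(1 - 3z^3)
        F.append(partial + second)
    return F


if __name__ == "__main__":
    F = coefficients(150)
    print(F[:12])
    for r in range(6):
        n = 138 + r
        print(r, F[n + 6] / F[n])
```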


▷ V.2. Extension to $\mathrm{Rat}^+$ functions. The conclusions of Theorem V.1 hold for any function
in $\mathrm{Rat}^+$ in the sense of Note 1. ◁


▷ V.3. Soittola’s Theorem. This is a converse to Theorem V.1 proved in [442]. Assume that
coefficients of an arbitrary rational function f(z) are nonnegative and that there exists a sectioning
such that each section admits a pure exponential-polynomial form. Then f(z) is in
$\mathrm{Rat}^+$ in the sense of Note 1; in particular, f is the OGF of a (weighted) regular class. ◁

FIGURE V.1. Plots of $\log F_n$ with $F_n = [z^n]F(z)$ and F(z) as in (9) display fluctuations
that disappear as soon as sections of modulus 6 are considered.


Theorem V.1 is useful for interpreting the enumeration of regular classes and
languages. It serves a similar purpose with regards to structural parameters of regular
classes. Consider a regular specification C augmented with a mark u that is, as usual,
a neutral object of size 0 (see Chapter III). We let C(z, u) be the corresponding BGF
of C, so that $C_{n,k} = [z^n u^k]C(z, u)$ is the number of C-objects of size n that bear k
marks. A suitable placement of marks makes it possible to record the number of times
any given construction enters an object. For instance, in the augmented specification
of binary words,


$\mathcal{C} = \bigl(\mathrm{SEQ}_{<r}(b) + u\,\mathrm{SEQ}_{\ge r}(b)\bigr)\;\mathrm{SEQ}\bigl(a\,(\mathrm{SEQ}_{<r}(b) + u\,\mathrm{SEQ}_{\ge r}(b))\bigr),$


all maximal runs of b having length at least r are marked by a u. There results the
following BGF for the corresponding parameter “number of b-runs of length ≥ r”,


$C(z,u) = \left(\frac{1-z^{r}}{1-z} + u\,\frac{z^{r}}{1-z}\right)\cdot\frac{1}{1 - z\left(\frac{1-z^{r}}{1-z} + u\,\frac{z^{r}}{1-z}\right)},$


from which mean and variance can be determined. In a sense, marks make it possible
to analyse the profile, with respect to constructions entering the specification, of a random
object.
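
As a sanity check on the marking technique, the following sketch compares a brute-force count of long b-runs with the coefficients obtained by differentiating the BGF. The closed form used in `total_long_runs_from_bgf` is a derivation added here, not a formula stated in the text: setting A = (1 − z^r + u z^r)/(1 − z), one has C = A/(1 − zA), so that ∂C/∂u at u = 1 equals z^r (1 − z)/(1 − 2z)^2, whose coefficient of z^n is (n − r + 2) 2^{n−r−1} for n > r and 1 for n = r. All function names are illustrative.

```python
from itertools import product


def long_runs(word, r):
    """Number of maximal runs of 'b' in `word` whose length is at least r."""
    count, run = 0, 0
    for letter in word + "a":          # the sentinel 'a' closes a trailing run
        if letter == "b":
            run += 1
        else:
            if run >= r:
                count += 1
            run = 0
    return count


def total_long_runs_bruteforce(n, r):
    """Cumulated number of long b-runs over all binary words of length n."""
    return sum(long_runs("".join(w), r) for w in product("ab", repeat=n))


def total_long_runs_from_bgf(n, r):
    """[z^n] of dC/du at u = 1, that is, of z^r (1 - z) / (1 - 2z)^2."""
    N = n - r
    if N < 0:
        return 0
    if N == 0:
        return 1
    return (N + 2) * 2 ** (N - 1)


if __name__ == "__main__":
    r = 2
    for n in range(8):
        print(n, total_long_runs_bruteforce(n, r), total_long_runs_from_bgf(n, r))
```

Dividing by 2^n gives the mean number of such runs in a uniform random word, namely (n − r + 2)/2^{r+1} when n > r.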


Theorem V.2 (Profile of regular classes). Consider a regular specification C augmented
with a mark and let $\chi$ be the parameter corresponding to the number of occurrences
of that mark. There exists a sectioning index d such that for any fixed section
of $(C_n)$ of modulus d, the following holds: any moment of integral order $s \ge 1$ of $\chi$
satisfies an asymptotic formula


(10) $E_{C_n}[\chi^{s}] = Q(n)\,\beta^{n} + O(G^{n}),$


where¹ $0 < \beta \le 1$, Q(n) is a rational fraction, and $G < \beta$.


In this statement, it is tacitly assumed that only sections that are not eventually 0 are
considered.


¹The quantities $\beta$, Q, G depend on the particular section considered.
PROOF. The case of expectations suffices to indicate the lines of a general proof. One
possible approach² is to build a derived specification D such that


$D(z) = \left.\frac{\partial}{\partial u}\,C(z,u)\right|_{u=1},$

which is also a regular specification. To this purpose, define a transformation $\partial$ on
specifications inductively by the rules


$\partial(\mathcal{A}+\mathcal{B}) = \partial\mathcal{A} + \partial\mathcal{B}, \qquad \partial(\mathcal{A}\times\mathcal{B}) = \partial\mathcal{A}\times\mathcal{B} + \mathcal{A}\times\partial\mathcal{B}, \qquad \partial\,\mathrm{SEQ}(\mathcal{A}) = \mathrm{SEQ}(\mathcal{A})\times\partial\mathcal{A}\times\mathrm{SEQ}(\mathcal{A}),$


together with the initial conditions $\partial u = 1$ and $\partial\mathcal{Z} = \emptyset$. This is a form of combinatorial
differentiation: an object $\gamma \in \mathcal{C}$ corresponds to $\chi(\gamma)$ objects in D, namely, one
for each choice of an occurrence of the mark.

As a consequence, $D_n$ is the cumulated value of $\chi$ over $\mathcal{C}_n$, so that $D_n/C_n = E_{C_n}[\chi]$.
On the other hand, D is a regular specification to which Theorem V.1 applies.
The result follows upon considering (if necessary) a sectioning that refines the
sectionings of both C and D. The argument extends easily to higher moments.


▷ V.4. An example. Consider the regular language $C = a^\star(b + c)^\star d\,(b + c)^\star$. Let $\chi$ be the
length of the initial run of a’s. Then one finds


$C(z) = \frac{z}{(1-z)(1-2z)^2}, \qquad D(z) = \frac{z^2}{(1-z)^2(1-2z)^2}.$

Thus the mean of $\chi$ satisfies

$E_{C_n}(\chi) \equiv \frac{D_n}{C_n} = \frac{(n-3)2^{n} + (n+3)}{(n-1)2^{n} + 1} = \frac{n-3}{n-1} + O\left(2^{-n}\right).$


Generally, in the statement of Theorem V.2, let Q(n) = A(n)/B(n) with A, B polynomials
and a = deg(A), b = deg(B). The following combinations prove to be possible (for first
moments): $\beta = 1$ and (a, b) any pair such that $0 \le a \le b + 1$; $\beta < 1$ and (a, b) any pair of
elements $\ge 0$. ◁
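
The closed forms appearing in Note V.4 can be confirmed by direct enumeration. In the sketch below (the function name `counts` is an assumption), a word of length n in a*(b + c)*d(b + c)* is decomposed uniquely as a^i, then j letters from {b, c}, then d, then k letters from {b, c}, with i + j + k = n − 1.

```python
def counts(n):
    """Return (C_n, D_n): the number of words of length n in a*(b+c)*d(b+c)*,
    and the cumulated length of their initial runs of a's."""
    C = D = 0
    for i in range(n):                 # length of the initial run of a's
        for j in range(n - i):         # length of the first (b+c)* block
            k = n - 1 - i - j          # length of the second (b+c)* block
            words = 2 ** (j + k)
            C += words
            D += i * words
    return C, D


if __name__ == "__main__":
    for n in range(1, 12):
        C, D = counts(n)
        print(n, C, D, D / C, (n - 3) / (n - 1) if n > 1 else None)
```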


▷ V.5. Shuffle products. Let $\mathcal{L}$, $\mathcal{M}$ be two languages over two disjoint alphabets. Then, the
shuffle product $\mathcal{S}$ of $\mathcal{L}$ and $\mathcal{M}$ is such that $\hat{S}(z) = \hat{L}(z)\cdot\hat{M}(z)$, where $\hat{S}$, $\hat{L}$, $\hat{M}$ are the
exponential generating functions of $\mathcal{S}$, $\mathcal{L}$, $\mathcal{M}$. Accordingly, if the OGFs L(z) and M(z) are
rational then the OGF S(z) is also rational. [This technique may be used to analyse generalized
birthday paradox and coupon collector problems; see [181].] ◁


V.2.3. Applications. This subsection details several examples that illustrate the
explicit determination of exponential-polynomial forms in regular specifications. Var- 
ious types of estimates conforming to Theorems V.1 and V.2 are obtained. 


— We start by recapitulating a collection of combinatorial problems (a “pot- 
pourri”, Example 1) already encountered in Chapters I-III, where rational 
function asymptotics has been used en passant.

— Next, we show how to develop a complete analysis of runs of consecutive 
equal letters in random sequences (Example 2): this is in theory a special 
case of the analysis of patterns in random texts (Section IV.6.3, p. 257), but
the particular nature of the patterns makes it possible to derive much more 
explicit results, including limit distributions for longest runs. 


²Equivalently, one may operate at generating function level and observe that the derivative of a $\mathrm{Rat}^+$
function is $\mathrm{Rat}^+$; cf Notes 1 and 2.
Class                            Asymptotics
Integer compositions             $2^{n-1}$
  – k summands                   $\sim n^{k-1}/(k-1)!$             (§I.3.1, p. 42)
  – summands ≤ r                 $\sim c_r\,\beta_r^{n}$                (§I.3.1, p. 40)
Integer partitions
  – k summands                   $\sim n^{k-1}/(k!\,(k-1)!)$       (§I.3.1, p. 42)
  – summands ≤ r                 $\sim n^{r-1}/(r!\,(r-1)!)$       (§I.3.1, p. 41)
Set partitions, k classes        $\sim k^{n}/k!$                   (§I.4.3, p. 58)
Words excluding a pattern p      $\sim c_p\,\beta_p^{n}$                (§IV.6.3, p. 257)


FIGURE V.2. A potpourri of regular classes and their asymptotics. 


— We then examine walks of the pure birth type (Example 3) that turn out to 
have applications to the analysis of a probabilistic algorithm (Approximate 
Counting, Example 4). 

— Finally, we present a mean and variance analysis of the occurrence of hidden 
patterns in random texts (subsequences, Example 5), which is sufficient to 
entail the concentration of distribution property. 


EXAMPLE V.1. A potpourri of regular specifications. We gather here a few combinatorial 
problems to be found scattered across Chapters I-IV that are reducible to regular specifications; 
see also Figure 2 for a summary. 

Compositions of integers (Section I.3, p. 37) are specified by $\mathcal{C} = \mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$,
whence the OGF (1 − z)/(1 − 2z) and the closed form $C_n = 2^{n-1}$, an especially trivial
exponential-polynomial form. Polar singularities are also present for compositions into k summands
that are described by $\mathrm{SEQ}_k(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$ and for compositions whose summands are
restricted to the interval [1 .. r] (i.e., $\mathrm{SEQ}(\mathrm{SEQ}_{1..r}(\mathcal{Z}))$), with corresponding generating functions


$\frac{z^{k}}{(1-z)^{k}}, \qquad \frac{1-z}{1 - 2z + z^{r+1}}.$

In the first case, there is an explicit form for the coefficients, $\binom{n-1}{k-1}$, which constitutes a particular
exponential-polynomial form (with the basis of the exponential being 1). The second case
requires a dedicated analysis of the dominant polar singularity. (Example 2 below treats the 
closely related problem of determining longest runs in random binary words.) 

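Integer partitions involve the multiset construction. However, when summands are restricted
to the interval [1 .. r], the specification and the OGF are given by


$\mathrm{MSET}(\mathrm{SEQ}_{1..r}(\mathcal{Z})) \cong \mathrm{SEQ}(\mathcal{Z}) \times \mathrm{SEQ}(\mathcal{Z}^{2}) \times \cdots \times \mathrm{SEQ}(\mathcal{Z}^{r}) \quad\Longrightarrow\quad \prod_{m=1}^{r} \frac{1}{1 - z^{m}}.$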


This case first introduced in Section I.3 (p. 37) has also served as a leading example in our
discussion of denumerants in Example IV.5 (p. 244), where the analysis of the pole at 1 furnishes
the dominant asymptotic behaviour, $n^{r-1}/(r!\,(r-1)!)$, for such special partitions. The
enumeration of partitions by number of parts then follows, by duality, from the staircase repre- 
sentation. 

Set partitions are typically labelled objects. However, when suitably constrained, they can 
be encoded by regular expressions; see Section I.4.3 (p. 58) for partitions into k classes, where
the OGF found is


$S^{(k)}(z) = \frac{z^{k}}{(1-z)(1-2z)\cdots(1-kz)}, \qquad\text{implying}\qquad S^{(k)}_{n} \sim \frac{k^{n}}{k!},$


and the asymptotic estimate results from the partial fraction decomposition and the dominant
pole at 1/k.

Words lead to many problems that are prototypical of the regular specification framework.
In Section I.4 (p. 47), we saw that one could give a regular expression describing the set of
words containing the pattern abb, from which the exact and asymptotic forms of counting coefficients
derive. For a general pattern p, the generating functions of words constrained to include
(or dually exclude) p are rational. The corresponding asymptotic analysis has been given in
Section IV.6.3 (p. 257).

Words can also be analysed under the Bernoulli model, where letter j is selected with
probability $p_j$; cf Section III.6.1 for a general discussion including the analysis of records in
random words (p. 179).
END OF EXAMPLE V.1.
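
Two of the entries of Example V.1 lend themselves to a quick numerical check. The sketch below (function names are illustrative assumptions) first compares compositions with summands in [1 .. r], counted directly, with the coefficients of (1 − z)/(1 − 2z + z^{r+1}). It then compares the number of set partitions into k classes, computed from the standard recurrence for Stirling numbers of the second kind, with the estimate k^n/k! coming from the dominant pole at 1/k.

```python
from math import factorial


def bounded_compositions(n_max, r):
    """c[n] = number of compositions of n with all summands in [1..r]."""
    c = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        c[n] = sum(c[n - i] for i in range(1, min(r, n) + 1))
    return c


def series_of_gf(n_max, r):
    """Coefficients of (1 - z) / (1 - 2z + z^(r+1)), read off from
    (1 - 2z + z^(r+1)) C(z) = 1 - z."""
    c = []
    for n in range(n_max + 1):
        value = (1 if n == 0 else 0) - (1 if n == 1 else 0)
        if n >= 1:
            value += 2 * c[n - 1]
        if n >= r + 1:
            value -= c[n - r - 1]
        c.append(value)
    return c


def stirling2(n, k):
    """Number of partitions of an n-set into k classes."""
    row = [1] + [0] * k                       # row for n = 0
    for _ in range(n):
        row = [0] + [j * row[j] + row[j - 1] for j in range(1, k + 1)]
    return row[k]


if __name__ == "__main__":
    print(bounded_compositions(10, 2), series_of_gf(10, 2))
    n, k = 60, 3
    print(stirling2(n, k) * factorial(k) / k ** n)   # close to 1
```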


▷ V.6. Partially commutative monoids. Let $W = A^\star$ be the set of all words over a finite
alphabet A. Consider a collection C of commutation rules between pairs of elements of A. For
instance, if A = {a, b, c}, then C = {ab = ba, ac = ca} means that a commutes with both b
and c, but bc is not a commuting pair: bc ≠ cb. Let M = W/[C] be the set of equivalence
classes of words (monomials) under the rules induced by C. The set M is said to be a partially
commutative monoid or a trace monoid [80].

If A = {a, b}, then the two possibilities for C are C = ∅ and C := {ab = ba}. Normal
forms for M are given by the regular expressions $(a + b)^\star$ and $a^\star b^\star$ corresponding to the OGFs


$\frac{1}{1 - a - b}, \qquad \frac{1}{1 - a - b + ab}.$


If A = {a, b, c}, the possibilities for C, the corresponding normal forms, and the OGFs M are
as follows. If C = ∅, then $M \cong (a + b + c)^\star$ with OGF $(1 - a - b - c)^{-1}$; the other cases are


Rules C:       ab = ba                              ab = ba, ac = ca                          ab = ba, ac = ca, bc = cb
Normal form:   $(a^\star b^\star c)^\star a^\star b^\star$                    $a^\star (b + c)^\star$                              $a^\star b^\star c^\star$
OGF:           $\frac{1}{1 - a - b - c + ab}$          $\frac{1}{1 - a - b - c + ab + ac}$          $\frac{1}{1 - a - b - c + ab + ac + bc - abc}$


Cartier and Foata [80] have discovered the general form (based on extended Möbius inversion),


$M = \left(\sum_{F} (-1)^{|F|}\,F\right)^{-1},$


where the sum is over all monomials F composed of distinct letters that all commute pairwise.
Goldwurm and Santini [239] have shown that $[z^{n}]M(z) \sim K \cdot \alpha^{n}$ for $K, \alpha > 0$. ◁
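
The Cartier–Foata form can be tested on small alphabets. In the sketch below every letter is replaced by z, so that the denominator becomes a polynomial in z. The helper names `mobius_series` and `count_classes` are illustrative assumptions. The first expands the inverse of that polynomial, the second counts equivalence classes of words by brute force, exploring all words reachable through swaps of adjacent commuting letters.

```python
from itertools import combinations, product


def mobius_series(alphabet, commuting, n_max):
    """Coefficients of M(z) = 1 / sum_F (-1)^{|F|} z^{|F|}, where F ranges over
    the sets of distinct, pairwise commuting letters."""
    commuting = {frozenset(pair) for pair in commuting}
    d = [0] * (len(alphabet) + 1)            # d[s] = signed number of sets F of size s
    for size in range(len(alphabet) + 1):
        for F in combinations(alphabet, size):
            if all(frozenset(p) in commuting for p in combinations(F, 2)):
                d[size] += (-1) ** size
    m = []
    for n in range(n_max + 1):               # power-series inverse, using d[0] = 1
        value = 1 if n == 0 else 0
        value -= sum(d[i] * m[n - i] for i in range(1, min(n, len(d) - 1) + 1))
        m.append(value)
    return m


def count_classes(alphabet, commuting, n):
    """Number of equivalence classes of words of length n (brute force)."""
    commuting = {frozenset(pair) for pair in commuting}
    seen, classes = set(), 0
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        if word in seen:
            continue
        classes += 1
        seen.add(word)
        stack = [word]
        while stack:                          # explore the whole class of `word`
            w = stack.pop()
            for i in range(len(w) - 1):
                if frozenset(w[i:i + 2]) in commuting:
                    swapped = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                    if swapped not in seen:
                        seen.add(swapped)
                        stack.append(swapped)
    return classes


if __name__ == "__main__":
    rules = ["ab", "ac"]
    print(mobius_series("abc", rules, 6))
    print([count_classes("abc", rules, n) for n in range(7)])
```

For the rules ab = ba, ac = ca the denominator is 1 − 3z + 2z^2, which agrees with the normal form a*(b + c)* and gives 2^{n+1} − 1 classes of length n.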


EXAMPLE V.2. Longest runs in words. Longest runs in words introduced in Section I.4.1
(p. 47) provide an illustration of the technique of localizing dominant singularities in rational 
functions and of the corresponding coefficient extraction process. The probabilistic problem is a 
famous one, discussed by Feller in [161], as it represents a basic question in the analysis of runs 
of good (or bad) luck in a succession of independent events. Our presentation closely follows 
an insightful note of Knuth [303] whose motivation was the analysis of carry propagation in 
certain binary adders. 


Start from the class W of all binary words over the alphabet {a, b}. Our interest lies in
the length L of the longest consecutive block of a’s in a word. For the property L < k, the
specification and the corresponding OGF are


$\mathcal{W}^{\langle k \rangle} = \mathrm{SEQ}_{<k}(a)\,\mathrm{SEQ}\bigl(b\,\mathrm{SEQ}_{<k}(a)\bigr) \quad\Longrightarrow\quad W^{\langle k \rangle}(z) = \frac{1 - z^{k}}{1 - z}\cdot\frac{1}{1 - z\,\frac{1 - z^{k}}{1 - z}},$

that is,

(11) $W^{\langle k \rangle}(z) = \frac{1 - z^{k}}{1 - 2z + z^{k+1}}.$


This represents a collection of OGFs indexed by k, which contain all the information relative to 
the distribution of longest runs in random words. We propose to prove: 


Proposition V.1. The longest run parameter L taken over the set of binary words of length n
(endowed with the uniform distribution) satisfies the uniform estimate³


(12) $P_n\bigl(L < \lfloor \lg n \rfloor + h\bigr) = e^{-\alpha(n)\,2^{-h-1}} + O\!\left(\frac{\log n}{\sqrt{n}}\right), \qquad \alpha(n) := 2^{\{\lg n\}}.$


In particular, the mean satisfies


$E_n(L) = \lg n + \frac{\gamma}{\log 2} - \frac{3}{2} + P(\lg n) + O\!\left(\frac{\log^{2} n}{\sqrt{n}}\right),$


where P is a continuous periodic function whose Fourier expansion is given by (20). The
variance satisfies $V_n(L) = O(1)$ and the distribution is concentrated around its mean.

The probability distributions appearing in (12) are known as double exponential distributions 
(Figure 3). The formula (12) does not represent a single limit distribution in the usual sense of 
Chapter IX, but rather a whole family of distributions indexed by the fractional part of lg n, thus 
dictated by the way n places itself with respect to powers of 2. 

PROOF. The proof consists of the following steps: locate the dominant pole; estimate the 
corresponding contribution; separate the dominant pole from the other poles in order to derive 
constructive error terms; finally approximate the main quantities of interest. 


(i) Location of the dominant pole. The OGF $W^{\langle k \rangle}$ has, by the first form of (11), a dominant
pole $\rho_k$ which is a root of the equation $1 = s(\rho_k)$, where $s(z) = z(1 - z^{k})/(1 - z)$. We
consider k > 2. Since s(z) is an increasing polynomial and s(0) = 0, s(1/2) < 1, s(1) = k, the
root $\rho_k$ must lie in the open interval (1/2, 1). In fact, as one easily verifies, the condition k > 2
guarantees that s(0.6) > 1, hence the refined estimate

(13) $\frac{1}{2} < \rho_k < \frac{3}{5} \qquad (k > 2).$

It now becomes possible to derive very precise estimates by bootstrapping. (This technique is a
form of iteration for approaching a fixed point; its use in the context of asymptotic expansions
is detailed in De Bruijn’s book [111].) Writing the defining equation for $\rho_k$ as a fixed point
equation,


$z = \frac{1}{2}\left(1 + z^{k+1}\right),$


and making use of the rough estimates (13) yields next

(14) $\frac{1}{2}\left(1 + \left(\frac{1}{2}\right)^{k+1}\right) < \rho_k < \frac{1}{2}\left(1 + \left(\frac{3}{5}\right)^{k+1}\right).$

Thus, $\rho_k$ is exponentially close to 1/2, and further iteration from (14) shows


(15) $\rho_k = \frac{1}{2} + \frac{1}{2^{k+2}} + O\!\left(\frac{k}{2^{2k}}\right),$


which constitutes a very precise estimate.


³The symbol lg x denotes the binary logarithm, $\lg x = \log_2 x$.


(ii) Contribution from the dominant pole. A straightforward calculation provides the value
of the residue,


(16) $R_{n,k} := -\operatorname{Res}\left[W^{\langle k \rangle}(z)\,z^{-n-1};\ z = \rho_k\right] = \frac{1 - \rho_k^{k}}{2 - (k+1)\rho_k^{k}}\;\rho_k^{-n-1},$


which is expected to provide the main approximation to the coefficients of $W^{\langle k \rangle}$ as n → ∞.
The quantity in (16) is of the rough form $2^{n} e^{-n/2^{k+1}}$; we shall return to such approximations
shortly.
(iii) Separation of the subdominant poles. Consider the circle |z| = 3/4 and take the second
form of the denominator of $W^{\langle k \rangle}$, namely,


$1 - 2z + z^{k+1}.$


In view of Rouché’s theorem, we may regard this polynomial as the sum f(z) + g(z), where
f(z) = 1 − 2z and $g(z) = z^{k+1}$. The term f(z) has on the circle |z| = 3/4 a modulus that varies
between 1/2 and 5/2; the term g(z) has modulus at most 27/64 for any k > 2. Thus, on the circle |z| = 3/4, one
has |g(z)| < |f(z)|, so that f(z) and f(z) + g(z) have the same number of zeros inside the
circle. Since f(z) admits z = 1/2 as only zero there, the denominator must also have a unique
root in |z| ≤ 3/4, and that root must coincide with $\rho_k$.

Similar arguments also give bounds on the error term when the number of words w satisfying
L(w) < k is estimated by the residue (16) at the dominant pole. On the circle |z| = 3/4,
the denominator of $W^{\langle k \rangle}$ stays bounded away from 0 (its modulus is at least 5/64 when k > 2,
by previous considerations). Thus, the modulus of the remainder integral is $O((4/3)^{n})$, and in
fact bounded from above by $35\,(4/3)^{n}$. In summary, letting $q_{n,k}$ represent the probability that
the longest run in a random word of length n is less than k, one obtains the main estimate


(17) $q_{n,k} := P_n(L < k) = \frac{1 - \rho_k^{k}}{1 - (k+1)\rho_k^{k}/2}\left(\frac{1}{2\rho_k}\right)^{n+1} + O\!\left(\left(\frac{2}{3}\right)^{n}\right),$

which holds uniformly with respect to k. Here is a table of the numerical values of the quantities
appearing in the approximation of $q_{n,k}$ when written under the form $c_k \cdot (2\rho_k)^{-n}$:


  k     $c_k \cdot (2\rho_k)^{-n}$
  3     1.13745 · 0.91964^n
  4     1.09166 · 0.96378^n
  5     1.05753 · 0.98297^n
  10    1.00394 · 0.99950^n
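
The bootstrapping of step (i) and the main term of (17) are easy to reproduce in floating point. The sketch below (function names are assumptions) iterates the fixed-point equation z = (1 + z^{k+1})/2 starting from 1/2 to approximate ρ_k, then evaluates the constant c_k and the base 1/(2ρ_k) of the approximation c_k (2ρ_k)^{−n}. It also compares ρ_k with the two-term estimate (15).

```python
def dominant_pole(k, iterations=200):
    """Approximate rho_k, the root of 1 - 2z + z^(k+1) near 1/2, by iterating
    the fixed-point equation z = (1 + z^(k+1)) / 2 from z = 1/2."""
    z = 0.5
    for _ in range(iterations):
        z = 0.5 * (1.0 + z ** (k + 1))
    return z


def main_term(k):
    """Return (c_k, base) such that q_{n,k} is approximated by c_k * base**n."""
    rho = dominant_pole(k)
    c = (1 - rho ** k) / (1 - (k + 1) * rho ** k / 2) / (2 * rho)
    return c, 1 / (2 * rho)


if __name__ == "__main__":
    for k in (3, 4, 5, 10):
        rho = dominant_pole(k)
        c, base = main_term(k)
        # last column: rho_k minus the two-term estimate (15)
        print(k, rho, round(c, 5), round(base, 5), rho - (0.5 + 2.0 ** -(k + 2)))
```

The iteration is a contraction near the root, so a fixed number of steps is ample here; this choice of iteration count is a convenience of the sketch rather than something prescribed by the text.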


(iv) Final approximations. There only remains to transform the main estimate (17) into
the limit form asserted in the statement. First, the “tail inequalities” (with $\lg x = \log_2 x$)


(18) $P_n\!\left(L < \tfrac{3}{4}\lg n\right) = O\!\left(e^{-\frac{1}{2}n^{1/4}}\right), \qquad P_n\bigl(L \ge 2\lg n + y\bigr) = O\!\left(\frac{e^{-2y}}{n}\right),$


describe the tail of the probability distribution of $L_n$. They derive from simple bounding techniques
applied to the main approximation (17) using (15). Thus, for asymptotic purposes, only
a small region around lg n needs to be considered.

Regarding the central regime, for k = lg n + x and x in $[-\frac{1}{4}\lg n, \lg n]$, the approximation (15)
of $\rho_k$ and related quantities applies, and one finds


$(2\rho_k)^{-n} = \exp\left(-\frac{n}{2^{k+1}} + O\bigl(k n 2^{-2k}\bigr)\right) = e^{-n/2^{k+1}}\left(1 + O\!\left(\frac{\log n}{\sqrt{n}}\right)\right).$


FIGURE V.3. The double exponential laws: Left, histograms for n at $2^{p}$ (black), $2^{p+1/3}$
(dark gray), and $2^{p+2/3}$ (light gray), where x = k − lg n. Right, empirical histograms for
1000 simulations with n = 100 (top) and n = 140 (bottom).


(This results from standard expansions like $(1 - a)^{n} = e^{-na}\exp(O(na^{2}))$.) At the same time,
the coefficient of this quantity in (17) is


$1 + O\bigl(k\rho_k^{k}\bigr) = 1 + O\!\left(\frac{\log n}{\sqrt{n}}\right).$

Thus a double exponential approximation holds (Figure 3): for k = lg n + x with x in
$[-\frac{1}{4}\lg n, \lg n]$, one has (uniformly)


(19) $q_{n,k} = e^{-n/2^{k+1}}\left(1 + O\!\left(\frac{\log n}{\sqrt{n}}\right)\right).$


In particular, upon setting $k = \lfloor \lg n \rfloor + h$ and making use of the tail inequalities (18), the first
part of the statement, namely Equation (12), follows. (The floor function takes into account the
fact that k must be an integer.)
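
The double exponential approximation (19) can be compared with exact probabilities. The sketch below (function names are assumptions) obtains the exact counts from the linear recurrence implied by (11), namely (1 − 2z + z^{k+1}) W(z) = 1 − z^k, checks them against brute force for small sizes, and then prints the exact probability next to exp(−n/2^{k+1}) for n = 1024.

```python
from itertools import product
from math import exp


def words_with_short_runs(n_max, k):
    """w[n] = number of binary words of length n in which every run of a's
    has length < k, from (1 - 2z + z^(k+1)) W(z) = 1 - z^k."""
    w = []
    for n in range(n_max + 1):
        value = (1 if n == 0 else 0) - (1 if n == k else 0)
        if n >= 1:
            value += 2 * w[n - 1]
        if n >= k + 1:
            value -= w[n - k - 1]
        w.append(value)
    return w


def brute_force(n, k):
    """Direct count: words of length n that do not contain a^k as a factor."""
    return sum(("a" * k) not in "".join(t) for t in product("ab", repeat=n))


if __name__ == "__main__":
    n = 1024
    for k in range(8, 13):
        exact = words_with_short_runs(n, k)[n] / 2 ** n
        print(k, exact, exp(-n / 2 ** (k + 1)))
```

For n = 1024 the two columns agree to two or three decimal places in the central range of k, consistent with the relative error O(log n/√n) in (19).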

The mean and variance estimates derive from the fact that the distribution quickly decays
at values away from lg n (by (18)) while it satisfies Equation (19) in the central region. The
mean satisfies


$E_n(L) = \sum_{h \ge 1} \bigl[1 - P_n(L < h)\bigr] = \Phi\!\left(\frac{n}{2}\right) - 1 + O\!\left(\frac{\log^{2} n}{\sqrt{n}}\right), \qquad \Phi(x) := \sum_{h \ge 0} \left[1 - e^{-x/2^{h}}\right].$


Consider the three cases $h < h_0$, $h \in [h_0, h_1]$, and $h > h_1$ with $h_0 = \lg x - \log\log x$ and
$h_1 = \lg x + \log\log x$, where the general term is (respectively) close to 1, between 0 and 1, and
close to 0. By summing, one finds elementarily $\Phi(x) = \lg x + O(\log\log x)$ as x → ∞. (An
elementary way of catching the next O(1) term is discussed for instance in [434, p. 403].)

The method of choice for precise asymptotics is to treat $\Phi(x)$ as a harmonic sum and apply
Mellin transform techniques (APPENDIX B: Mellin Transform, p. 707). The Mellin transform
of $\Phi(x)$ is


$\Phi^{\star}(s) := \int_{0}^{\infty} \Phi(x)\,x^{s-1}\,dx = -\frac{\Gamma(s)}{1 - 2^{s}}, \qquad \Re(s) \in (-1, 0).$

The double pole of $\Phi^{\star}$ at 0 and the simple poles at $s = 2ik\pi/\log 2$ are reflected by the asymptotic
expansion:

(20) $\Phi(x) = \lg x + \frac{\gamma}{\log 2} + \frac{1}{2} + P(\lg x) + O(x^{-1}), \qquad P(w) := -\frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} \Gamma\!\left(\frac{2ik\pi}{\log 2}\right) e^{-2ik\pi w}.$


The oscillating function P(w) has amplitude of the order of $10^{-6}$. (See [184, 252, 303, 458]
for more on this topic.) The variance is similarly analysed. This concludes the proof of Proposition V.1.
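
The expansion (20) can be observed numerically. The sketch below (function names are assumptions) evaluates the harmonic sum Φ(x) directly and subtracts the smooth part lg x + γ/log 2 + 1/2. What remains is the periodic fluctuation P(lg x), which stays of the order of 10^{−6}. The value of Euler's constant γ is hard-coded since the standard `math` module does not expose it.

```python
from math import expm1, log, log2

EULER_GAMMA = 0.5772156649015329


def phi(x, terms=200):
    """Harmonic sum Phi(x) = sum over h >= 0 of (1 - exp(-x / 2^h)), truncated."""
    # -expm1(-t) equals 1 - exp(-t) and stays accurate when t is tiny
    return sum(-expm1(-x / 2.0 ** h) for h in range(terms))


def smooth_part(x):
    """Non-oscillating part of the expansion (20)."""
    return log2(x) + EULER_GAMMA / log(2) + 0.5


if __name__ == "__main__":
    for x in (100.0, 1000.0, 54321.0):
        print(x, phi(x), smooth_part(x), phi(x) - smooth_part(x))
```

Since the mean satisfies E_n(L) = Φ(n/2) − 1 up to a small error, the same computation gives a quick numerical estimate of the expected longest run.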


The double exponential approximation in (12) is typical of extremal statistics. What is
striking here is the existence of a family of distributions indexed by the fractional part of lg n.
This fact is then reflected by the presence of oscillating functions in moments of the random
variable L.
END OF EXAMPLE V.2.


▷ V.7. Longest runs in Bernoulli sequences. Consider an alphabet $A = \{a_j\}$ with letter $a_j$
independently chosen with probability $p_j$. The OGF of words where each run of equal letters
has length at most k derives from the construction of Smirnov words (pp. 193 and 249), and it
is found to be


$W^{[k]}(z) = \left(1 - \sum_{j} \frac{p_j z\,\bigl(1 - (p_j z)^{k}\bigr)}{1 - (p_j z)^{k+1}}\right)^{-1}.$


Let $p_{\max}$ be the largest of the $p_j$. Then the expected length of the longest run of any letter is
$\log n/\log p_{\max}^{-1} + O(1)$, and precise quantitative information can be derived from the OGFs by
methods akin to Example IV.9 (Smirnov words and Carlitz compositions, p. 249). ◁


Walks of the pure birth type. The next two examples develop the analysis of 
walks in a special type of graphs. These examples serve two purposes: they illustrate 
further cases of modelling by means of regular specifications, and, at the same time, 
provide a bridge to the analysis of lattice paths in the next section. 


EXAMPLE V.3. Walks of the pure-birth type. Consider a walk on the nonnegative integers
that starts at 0 and is only allowed either to stay at the same place or move by an increment
of +1. Our goal is to enumerate the walks that start from 0 and reach point m − 1 in n steps. A
step from j to j + 1 will be encoded by a letter $a_j$; a step from j to j will be encoded by $c_j$, in
accordance with the following state diagram:

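(21) [State diagram: states 0, 1, 2, ... arranged in a row; state j carries a self-loop labelled $c_j$ and an arrow labelled $a_j$ leading to state j + 1.]
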
The language encoding all legal walks from state 0 to state m can be described by a regular 
expression, 


$\mathcal{H}_{0,m} = \mathrm{SEQ}(c_0)\,a_0\,\mathrm{SEQ}(c_1)\,a_1 \cdots \mathrm{SEQ}(c_{m-1})\,a_{m-1}\,\mathrm{SEQ}(c_m).$


Symbolically using letters as variables, the corresponding ordinary multivariate generating
function is then


$H_{0,m}(\vec{a}, \vec{c}) = \frac{a_0 a_1 \cdots a_{m-1}}{(1 - c_0)(1 - c_1)\cdots(1 - c_m)}.$

Assume now that the steps are assigned weights, with $\alpha_j$ corresponding to $a_j$ and $\gamma_j$ to $c_j$.
Weights of letters are extended multiplicatively to words in the usual way (cf Section III.6.1,
p. 178). In addition, upon taking $\gamma_j = 1 - \alpha_j$, one obtains a probabilistic weighting: the walker
starts from position 0, and, if at j, at each clock tick, she either stays at the same place with
probability $1 - \alpha_j$ or moves to the right with probability $\alpha_j$. The OGF of such weighted walks
then becomes


(22) $H_{0,m}(z) = \frac{\alpha_0 \alpha_1 \cdots \alpha_{m-1}\,z^{m}}{(1 - (1 - \alpha_0)z)(1 - (1 - \alpha_1)z)\cdots(1 - (1 - \alpha_m)z)},$

and $[z^{n}]H_{0,m}$ is the probability for the walker to be found at position m at (discrete) time n.
This walk process can be alternatively interpreted as a (discrete-time) pure-birth process⁴ in
the usual sense of probability theory: There is a population of individuals and, at each discrete
epoch, a new birth may take place, the probability of a birth being $\alpha_j$ when the population is of
size j.


FIGURE V.4. A simulation of 10 trajectories of the pure-birth process till n = 1024, with geometric
probabilities corresponding to q = 1/2, compared to the curve $\log_2 x$.


The form (22) readily lends itself to a partial fraction decomposition. Assume for simplicity
that the $\alpha_j$ are all distinct. The poles of $H_{0,m}$ are at the points $(1 - \alpha_j)^{-1}$ and one finds as
$z \to (1 - \alpha_j)^{-1}$:


$H_{0,m}(z) \sim \frac{r_{j,m}}{1 - z(1 - \alpha_j)}, \qquad\text{where}\qquad r_{j,m} := \frac{\alpha_0 \alpha_1 \cdots \alpha_{m-1}}{\prod_{k \in [0,m],\ k \ne j} (\alpha_k - \alpha_j)}.$


Thus, the probability of being in state $m$ at time $n$ is given by a sum:

(23)   $$[z^n]H_{0,m}(z) = \sum_{j=0}^{m} r_{j,m}\,(1-\alpha_j)^n.$$

An especially interesting case of the pure-birth walk is when the quantities $\alpha_k$ are geometric:
$\alpha_k = q^k$ for some $q$ with $0 < q < 1$. In that case, the probability of being in state $m$
after $n$ transitions becomes (cf. (23))

(24)   $$\sum_{j=0}^{m} \frac{(-1)^j q^{\binom{j}{2}}}{(q)_j\,(q)_{m-j}}\,(1-q^{m-j})^n, \qquad (q)_j := (1-q)(1-q^2)\cdots(1-q^j).$$

This corresponds to a stochastic progression in a medium with exponentially increasing hardness
or, equivalently, to the growth of a population whose size adversely affects fertility in an
exponential manner. On intuitive grounds, we expect an evolution of the process to stay reasonably
close to the curve $y = \log_{1/q} x$; see Figure 4 for a simulation confirming this fact, which
can be justified by means of formula (24). This particular analysis is borrowed from [172],
where it was initially developed in connection with the "approximate counting" algorithm to be
studied next. ....... END OF EXAMPLE V.3.
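The sum formula (23) and its geometric specialization (24) can be compared with a direct step-by-step computation of the walk. The following sketch uses exact rationals; the function names are illustrative, and `alpha` is assumed to be any function returning pairwise distinct birth probabilities.

```python
from fractions import Fraction


def birth_probabilities_dp(alpha, n):
    """Law of the walker's position after n steps, by direct iteration."""
    dist = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(n):
        new = [Fraction(0)] * (n + 1)
        for j, pr in enumerate(dist):
            if pr:
                new[j] += pr * (1 - alpha(j))       # stay at j
                if j + 1 <= n:
                    new[j + 1] += pr * alpha(j)      # move to j + 1
        dist = new
    return dist


def birth_probability_formula(alpha, m, n):
    """Right-hand side of (23); requires alpha(0), ..., alpha(m) pairwise distinct."""
    num = Fraction(1)
    for k in range(m):
        num *= alpha(k)
    total = Fraction(0)
    for j in range(m + 1):
        den = Fraction(1)
        for k in range(m + 1):
            if k != j:
                den *= alpha(k) - alpha(j)
        total += num / den * (1 - alpha(j)) ** n
    return total


def q_pochhammer(q, j):
    """(q)_j = (1 - q)(1 - q^2)...(1 - q^j)."""
    out = Fraction(1)
    for i in range(1, j + 1):
        out *= 1 - q ** i
    return out


def geometric_formula(q, m, n):
    """Formula (24), i.e. (23) specialized to alpha_k = q^k."""
    return sum(
        (-1) ** j * q ** (j * (j - 1) // 2)
        / (q_pochhammer(q, j) * q_pochhammer(q, m - j))
        * (1 - q ** (m - j)) ** n
        for j in range(m + 1)
    )
```

With `alpha = lambda k: Fraction(1, 2) ** k`, the three computations agree for every `m <= n`; the exact arithmetic avoids the cancellation that the alternating sum would suffer in floating point.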


⁴ The theory of pure-birth processes is discussed under a calculational and non measure-theoretic
angle in the book by Bharucha-Reid [50]. See also the Course by Karlin and Taylor [290] for a concrete
presentation.
EXAMPLE V.4. Approximate Counting. Assume you need to keep a counter that is able to
record the number of certain events (say impulses) and should have the capability of keeping
counts till a certain maximal value $N$. A standard information-theoretic argument (with $\ell$ bits,
one can only keep track of $2^\ell$ possibilities) implies that one needs $\lceil \log_2(N+1) \rceil$ bits to perform
the task; a standard binary counter will indeed do the job. However, in 1977, Robert Morris
proposed a way to maintain counters that only requires of the order of $\log\log N$ bits. What's
the catch?

Morris' elegant idea consists in relaxing the constraint of exactness in the counting process
and, by playing with probabilities, tolerating a small error on the counts obtained. Precisely, his
solution maintains a random quantity $Q$ which is initialized by $Q = 0$. Upon receiving an
impulse, one updates $Q$ according to the following simple procedure (with $q \in (0,1)$ a design
parameter):

    procedure Update(Q);
        with probability $q^{Q+1}$ do Q := Q + 1 (else keep Q unchanged).

When asked the number of impulses (number of times the update procedure was called) at any
moment, simply use the following procedure to return an estimate:

    procedure Answer(Q);
        output $X = \dfrac{q^{-Q} - 1}{1 - q}$.
Let $Q_n$ be the value of the random quantity $Q$ after $n$ executions of the update procedure
and $X_n$ the corresponding estimate output by the algorithm. It is easy to verify (by recurrence
or by generating functions; see Note 8 below for higher moments) that

(25)   $$E(q^{-Q_n}) = n(1-q) + 1, \quad\text{so that}\quad E(X_n) = n.$$

Thus the answer provided at any instant is an unbiased estimator (in a mean value sense) of
the actual count $n$. On the other hand, the analysis of the geometric pure-birth process in the
previous example applies. In particular, the exponential approximation $(1-\alpha)^n \approx e^{-n\alpha}$
in conjunction with the basic formula (24) shows that for large $n$ and $m$ sufficiently near to
$\log_{1/q} n$, one has (asymptotically) the geometric-birth distribution

(26)   $$P(Q_n = m) = \sum_{j=0}^{\infty} \frac{(-1)^j q^{\binom{j}{2}}}{(q)_j\,(q)_\infty}\,\exp(-q^{x-j}) + o(1), \qquad x \equiv m - \log_{1/q} n.$$

(We refer to [172] for details.) Such calculations imply that $Q_n$ is with high probability (w.h.p.)
close to $\log_{1/q} n$. Thus, if $n \le N$, the value of $Q_n$ will be w.h.p. bounded from above by
$(1+\epsilon)\log_{1/q} N$, with $\epsilon$ a small constant. But this means that the integer $Q$, which can itself
be represented in binary, will only require

(27)   $$\log_2 \log n + O(1)$$

bits for storage, for fixed $q$.

A closer examination of the formulae reveals that the accuracy of the estimate improves
considerably when $q$ becomes close to 1. The standard error is defined as $\frac{1}{n}\sqrt{V(X_n)}$ and it
measures (in a mean quadratic sense) the relative error likely to be made. The variance of $Q_n$
is, like the mean, determined by recurrence or generating functions, and one finds

(28)   $$V(q^{-Q_n}) = \binom{n+1}{2}\frac{(1-q)^3}{q}, \qquad \frac{1}{n}\sqrt{V(X_n)} \sim \sqrt{\frac{1-q}{2q}}$$

(see also Note 8 below). This means that accuracy increases as $q$ approaches 1 and, by suitably
dimensioning $q$, one can make the standard error asymptotically as small as desired. In summary,
(25), (28), and (27) express the following property: Approximate counting makes it possible to
count till $N$ using only about $\log\log N$ bits of storage, while achieving a standard error that is
asymptotically a constant and can be set to any prescribed small value. Morris' trick is now fully
understood.

For instance, with $q = 2^{-1/16}$, it proves possible to count up to $2^{16} = 65536$ using only
8 bits (instead of 16), with an error likely not to exceed 20%. Naturally, there's not too much
reason to appeal to the algorithm when a single counter needs to be managed. (Everybody can
afford a few bits!) Approximate Counting turns out to be useful when a very large number of
counts need to be kept simultaneously. It constitutes one of the early examples of a probabilistic
algorithm in the extraction of information from large volumes of data, an area also known as
data mining; see [177] for a review of connections with analytic combinatorics and references.

Functions akin to those of (26) also surface in other areas of probability theory. Guillemin,
Robert, and Zwart [255] have detected them in processes that combine an additive increase and
a multiplicative decrease (AIMD processes), in a context motivated by the adaptive transmission
of "windows" of varying sizes in large communication networks (the TCP protocol of the
internet). Biane, Bertoin, and Yor [48] encountered a function identical to (26) in their study of
exponential functionals of Poisson processes. ....... END OF EXAMPLE V.4.
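The two procedures of Example V.4 translate directly into code. The sketch below assumes the update probability $q^{Q+1}$ and the initial value $Q = 0$; it also computes the exact law of $Q_n$ for a rational $q$, so that the mean in (25) and the variance of $q^{-Q_n}$ in (28) can be checked without sampling. The function names are illustrative.

```python
import random
from fractions import Fraction
from math import comb


def update(Q, q, rng=random):
    """Update(Q): increment Q with probability q**(Q + 1), else keep it."""
    return Q + 1 if rng.random() < q ** (Q + 1) else Q


def answer(Q, q):
    """Answer(Q): the estimate X = (q**(-Q) - 1) / (1 - q)."""
    return (q ** (-Q) - 1) / (1 - q)


def exact_distribution(q, n):
    """Exact law of Q_n as a dict {m: P(Q_n = m)}, for a rational q."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for Q, pr in dist.items():
            up = q ** (Q + 1)
            new[Q + 1] = new.get(Q + 1, 0) + pr * up
            new[Q] = new.get(Q, 0) + pr * (1 - up)
        dist = new
    return dist


def mean_and_variance_of_power(q, n):
    """Mean and variance of q**(-Q_n), computed from the exact law."""
    dist = exact_distribution(q, n)
    mean = sum(pr * q ** (-Q) for Q, pr in dist.items())
    second = sum(pr * q ** (-2 * Q) for Q, pr in dist.items())
    return mean, second - mean ** 2
```

A typical use keeps only the small integer `Q` per counter, calls `update` on each impulse, and calls `answer` when an estimate is requested; taking `q` closer to 1 trades a larger `Q` for a smaller standard error.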


▷ V.8. Moments of $q^{-Q_n}$. It is a perhaps surprising fact that any integral moment of $q^{-Q_n}$ is a
polynomial in $n$ and $q$, like in (25), (28). To see it, define

$$\Phi(w) \equiv \Phi(w;\xi,q) := \sum_{m\ge 0} \frac{q^{m(m+1)/2}\,\xi^m w^m}{(1+\xi q)(1+\xi q^2)\cdots(1+\xi q^{m+1})}.$$

By (22), one has

$$\sum_{m\ge 0} H_{0,m}(z)\,w^m = \frac{1}{1-z}\,\Phi\!\left(w;\frac{z}{1-z},q\right).$$

On the other hand, $\Phi$ satisfies $\Phi(w) = 1 - q\xi(1-w)\Phi(qw)$, hence the $q$-identity,

$$\Phi(w) = \sum_{j\ge 0} (-q\xi)^j \left[(1-w)(1-qw)\cdots(1-q^{j-1}w)\right],$$

which resorts to $q$-calculus⁵. Thus $\Phi(q^{-r};\xi,q)$ is a polynomial for any $r \in \mathbb{Z}_{\ge 0}$, as the
expansion terminates. See Prodinger's study [403] for connections with basic hypergeometric
functions and Heine's transformation. ◁

Hidden patterns: Regular expression modelling and moments. We return here 
to the analysis of the number of occurrences of a pattern p as a subsequence in a 
random text. The mean number of occurrences can be obtained by enumerating con- 
texts of occurrences: in a sense we are then enumerating the language of all words by 
means of a dedicated regular expression where the ambiguity coefficient (the multi- 
plicity) of a word is precisely equal to the number of occurrences of the pattern. This 
technique, which gives an easy access to expectations, also works for higher moments. 
It supplements the fact that there is no easy way to get a BGF in such cases. 


EXAMPLE V.5. Occurrences of "hidden" patterns in texts. Fix an alphabet $\mathcal{A} = \{a_1,\ldots,a_r\}$
of cardinality $r$ and assume a probability distribution on $\mathcal{A}$ to be given, with $p_j$ the probability
of letter $a_j$. We consider the Bernoulli model on $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$, where the probability of a word
is the product of the probabilities of its letters (cf. Section III.6.1, p. 178). A word $p = y_1\cdots y_k$
called the pattern is fixed. The problem is to gather information on the random variable $X$
representing the number of occurrences of $p$ in the set $\mathcal{W}_n$, where occurrences as a "hidden
pattern", i.e., as a subsequence, are counted (Example I.11, p. 51).

⁵ By $q$-calculus is roughly meant the collection of special function identities relating power series of
the form $\sum a_n(q) z^n$, where $a_n(q)$ is a rational fraction whose degree is quadratic in $n$. See [11, Ch. 10]
for basics and [230] for more advanced ($q$-hypergeometric) material.

The generating function associated to $\mathcal{W}$ endowed with its probabilistic weighting is

$$W(z) = \frac{1}{1-\sum_j p_j z} = \frac{1}{1-z}.$$

The regular specification

(29)   $$\mathcal{O} = \mathrm{SEQ}(\mathcal{A})\,y_1\,\mathrm{SEQ}(\mathcal{A}) \cdots \mathrm{SEQ}(\mathcal{A})\,y_{k-1}\,\mathrm{SEQ}(\mathcal{A})\,y_k\,\mathrm{SEQ}(\mathcal{A})$$

describes all contexts of occurrences of $p$ as a subsequence in all words. Graphically, this may
be rendered as follows for a pattern of length 3, $p = y_1y_2y_3$:

(30)   [Diagram: three boxes labelled $y_1$, $y_2$, $y_3$ on a horizontal line.]

There the boxes indicate distinguished positions where letters of the pattern appear and the
horizontal lines represent arbitrary separating words ($\mathrm{SEQ}(\mathcal{A})$). The corresponding OGF

(31)   $$O(z) = \frac{\pi(p)\,z^k}{(1-z)^{k+1}}, \qquad \pi(p) := p_{y_1}\cdots p_{y_k},$$

counts elements of $\mathcal{W}$ with multiplicity⁶, where the multiplicity coefficient $\lambda(w)$ of a word $w \in
\mathcal{W}$ is precisely equal to the number of occurrences of $p$ as a subsequence in $w$:

$$O(z) = \sum_{w\in\mathcal{A}^*} \lambda(w)\,\pi(w)\,z^{|w|}.$$

There results that the mean number of hidden occurrences of $p$ in a random word of length $n$ is

(32)   $$[z^n]O(z) = \pi(p)\binom{n}{k},$$


which is consistent with what a direct probabilistic reasoning would give. 
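The mean value (32) is easy to confirm by exhaustive enumeration on a small alphabet. The sketch below counts subsequence occurrences with the standard dynamic program and averages over all words of length $n$ under the Bernoulli model; the function names and the sample distribution are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import comb


def count_hidden(word, pattern):
    """Number of occurrences of pattern as a subsequence of word."""
    ways = [1] + [0] * len(pattern)   # ways[i]: matches of pattern[:i] so far
    for letter in word:
        # go right to left so each text letter is used at most once per match
        for i in range(len(pattern), 0, -1):
            if pattern[i - 1] == letter:
                ways[i] += ways[i - 1]
    return ways[-1]


def mean_occurrences(probs, pattern, n):
    """Exact expectation of the number of hidden occurrences over words of length n."""
    total = Fraction(0)
    for word in product(probs, repeat=n):   # iterates over the letters (dict keys)
        weight = Fraction(1)
        for letter in word:
            weight *= probs[letter]
        total += weight * count_hidden(word, pattern)
    return total


def predicted_mean(probs, pattern, n):
    """pi(p) * binomial(n, k), as in (32)."""
    pi = Fraction(1)
    for letter in pattern:
        pi *= probs[letter]
    return pi * comb(n, len(pattern))
```

The enumeration is exponential in $n$ and only meant as a sanity check; the dynamic program in `count_hidden` alone runs in time proportional to the product of the text and pattern lengths.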

We next proceed to determine the variance of $X$ over $\mathcal{W}_n$. In order to do so, we need
contexts in which pairs of occurrences appear. Let $\mathcal{Q}$ denote the set of all words in $\mathcal{W}$ with two
occurrences (i.e., an ordered pair of occurrences) of $p$ as a subsequence being distinguished.
Then clearly $[z^n]Q(z)$ represents $E_{\mathcal{W}_n}[X^2]$. There are several cases to be considered. Graphically,
a pair of occurrences may share no common position, like in what follows:

(33)   [Diagram: two occurrences of $y_1y_2y_3$ with no shared position.]

But they may also have one or several overlapping positions, like in

(34), (35)   [Diagrams: two occurrences of $y_1y_2y_3$ sharing one or several positions.]

(This last situation necessitates $y_2 = y_3$, typical patterns being abb and aaa.)

⁶ In language-theoretic terms, we are making use of the regular expression $\mathcal{O} =
\mathcal{A}^* y_1 \mathcal{A}^* \cdots y_{k-1}\mathcal{A}^* y_k \mathcal{A}^*$, that describes a subset of $\mathcal{A}^*$ in an ambiguous manner and take into account
the ambiguity coefficients.
In the first case corresponding to (33), where there are no overlapping positions, the
configurations of interest have OGF

(36)   $$Q^{[0]}(z) = \binom{2k}{k}\frac{\pi(p)^2\,z^{2k}}{(1-z)^{2k+1}}.$$

There, the binomial coefficient $\binom{2k}{k}$ counts the total number of ways of freely interleaving two
copies of $p$; the quantity $\pi(p)^2 z^{2k}$ takes into account the $2k$ distinct positions where the letters
of the two copies appear; the factor $(1-z)^{-2k-1}$ corresponds to all the possible $2k+1$ fillings
of the gaps between letters.

In the second case, let us start by considering pairs where exactly one position is overlapping,
like in (34). Say this position corresponds to the $r$th and $s$th letters of $p$ ($r$ and $s$ may be
unequal). Obviously, we need $y_r = y_s$ for this to be possible. The OGF of the configurations
is now

$$\binom{r+s-2}{r-1}\binom{2k-r-s}{k-r}\frac{\pi(p)^2\,(p_{y_r})^{-1}\,z^{2k-1}}{(1-z)^{2k}}.$$

There, the first binomial coefficient $\binom{r+s-2}{r-1}$ counts the total number of ways of interleaving
$y_1\cdots y_{r-1}$ and $y_1\cdots y_{s-1}$; the second binomial $\binom{2k-r-s}{k-r}$ is similarly associated to the
interleavings of $y_{r+1}\cdots y_k$ and $y_{s+1}\cdots y_k$; the numerator takes into account the fact that $2k-1$
positions are now occupied by predetermined letters; finally the factor $(1-z)^{-2k}$ corresponds
to all the $2k$ fillings of the gaps between letters. Summing over all possibilities for $r, s$ gives the
OGF of pairs with one overlapping position as

(37)   $$Q^{[1]}(z) = \frac{\pi(p)^2\,z^{2k-1}}{(1-z)^{2k}} \sum_{1\le r,s\le k} \binom{r+s-2}{r-1}\binom{2k-r-s}{k-r}\frac{[\![y_r = y_s]\!]}{p_{y_r}}.$$

Similar arguments show that the OGF of pairs of occurrences with at least two shared
positions (see, e.g., (35)) is of the form, with $P$ a polynomial,

(38)   $$Q^{[\ge 2]}(z) = \frac{P(z)}{(1-z)^{2k-1}},$$

for the essential reason that, in the finitely many remaining situations, there are at most $2k-1$
possible gaps.

We can now examine (36), (37), (38) in the light of singularities. The coefficient $[z^n]Q^{[0]}(z)$
is seen to cancel to first asymptotic order with the square of the mean as given in (32). The
contribution of the coefficient $[z^n]Q^{[\ge 2]}(z)$ appears to be negligible as it is $O(n^{2k-2})$. The
coefficient $[z^n]Q^{[1]}(z)$, which is $O(n^{2k-1})$, is seen to contribute to the asymptotic growth of
the variance. In summary, after a trite calculation, we obtain:


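Proposition V.2. The number $X$ of occurrences of a hidden pattern $p$ in a random text of size $n$
obeying a Bernoulli model satisfies

$$E_{\mathcal{W}_n}[X] = \pi(p)\binom{n}{k} \sim \frac{\pi(p)}{k!}\,n^k, \qquad V_{\mathcal{W}_n}[X] = \frac{\pi(p)^2\,\kappa(p)^2}{(2k-1)!}\,n^{2k-1}\left(1 + O\!\left(\frac{1}{n}\right)\right),$$

where the "correlation coefficient" $\kappa(p)^2$ is given by

$$\kappa(p)^2 = \sum_{1\le r,s\le k} \binom{r+s-2}{r-1}\binom{2k-r-s}{k-r}\left(\frac{[\![y_r = y_s]\!]}{p_{y_r}} - 1\right).$$

In particular, the distribution of $X$ is concentrated around its mean.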
This example is based on an article by Flajolet, Szpankowski, and Vallée [215]. There the
authors show further that the asymptotic behaviour of moments of higher order can be worked
out. By the moment convergence theorem, this calculation entails that the distribution of $X$
over $\mathcal{W}_n$ is asymptotically normal. The method also extends to a much more general notion
of "hidden" pattern, e.g., distances between letters of $p$ can be constrained in various ways
so as to determine a valid occurrence in the text [215]. It also extends to the very general
framework of dynamical sources [65], which include Markov models as a special case. The
two references [65, 215] thus provide a set of analyses that interpolate between the two extreme
notions of pattern occurrence: as a block of consecutive symbols or as a subsequence ("hidden
pattern"). Such studies demonstrate that hidden patterns are with high probability bound to
occur an extremely large number of times in a long enough text. This might cast some doubts
on numerological interpretations encountered in various cultures: see in particular the critical
discussion of the "Bible Codes" by McKay et al. in [354]. ....... END OF EXAMPLE V.5.


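▷ V.9. Hidden patterns and shuffle relations. To each pair $u, v$ of words over $\mathcal{A}$ associate
the weighted-shuffle polynomial in the indeterminates $\mathcal{A}$ denoted by $\binom{u}{v}_t$ and defined by the
properties

$$\binom{xu}{yv}_t = x\binom{u}{yv}_t + y\binom{xu}{v}_t + t\,[\![x = y]\!]\,x\binom{u}{v}_t, \qquad \binom{\mathbf{1}}{u}_t = \binom{u}{\mathbf{1}}_t = u,$$

where $t$ is a parameter, $x, y$ are elements of $\mathcal{A}$, and $\mathbf{1}$ is the empty word. Then the OGF $Q(z)$
above is

$$Q(z) = \sigma\!\left[\binom{p}{p}_{(1-z)}\right]\frac{1}{(1-z)^{2k+1}},$$

where $\sigma$ is the substitution $a_j \mapsto p_j z$. ◁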


V.3. Nested sequences, lattice paths, and continued fractions. 


This section treats nested sequence constructions corresponding to a schema involving
a cascade of sequences of the rough form SEQ ∘ SEQ ∘ ··· ∘ SEQ. Such a
schema covers Dyck and Motzkin paths, a particular type of Łukasiewicz paths already
encountered in Section I.5.3 (p. 68). Equipped with probabilistic weights, these
paths appear as trajectories of birth-and-death processes (the case of pure-birth processes
has already been dealt with in Example 3 above). They also have great descriptive
power since, once endowed with integer weights, they can encode a large variety of
combinatorial classes, including trees, permutations, set partitions, and surjections.

Since a combinatorial sequence translates into a quasi-inverse, $Q(f) = (1-f)^{-1}$,
a class described by nested sequences has its generating function expressed by a cascade
of fractions, that is, a continued fraction⁷. Analytically, these GFs have at most
two dominant poles (the Dyck case) or a single pole (the Motzkin case) on their disc
of convergence, so that the implementation of the process underlying Theorem V.1
is easy: we encounter a pure polynomial form of the simplest type that describes all
counting sequences of interest. The profile of a nested sequence can also be easily
characterized.

⁷ Characteristically, the German term for "continued fraction" is "Kettenbruch", literally
"chain-fraction".

This section starts with a statement of the "Continued Fraction Theorem" taken
from an old study of Flajolet [168], which provides the general set up for the rest of
the section. It then proceeds with the general analytic treatment of nested sequences.
A number of examples from various areas of discrete mathematics are then detailed.
Some of these make use of structures that are described as infinitely nested sequences,
that is, infinite continued fractions, to which the finite theory often extends; the
analysis of coin fountains below is typical.


V.3.1. Combinatorial aspects. We discuss here a special type of lattice paths
connecting points of the discrete Cartesian plane $\mathbb{Z} \times \mathbb{Z}$.


Definition V.3 (Lattice path). A Motzkin path $\upsilon = (U_0, U_1, \ldots, U_n)$ is a sequence
of points in the discrete quarter plane $\mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0}$ such that $U_j = (j, y_j)$, and the
jump condition $|y_{j+1} - y_j| \le 1$ is satisfied. An edge $(U_j, U_{j+1})$ is called an ascent if
$y_{j+1} - y_j = +1$, a descent if $y_{j+1} - y_j = -1$, and a level step if $y_{j+1} - y_j = 0$. A
path that has no level steps is called a Dyck path.

The quantity $n$ is the length of the path, $\mathrm{ini}(\upsilon) := y_0$ is the initial altitude,
$\mathrm{fin}(\upsilon) := y_n$ is the final altitude. A path is called an excursion if both its initial
and final altitudes are zero. The extremal quantities $\sup\{\upsilon\} := \max_j y_j$ and
$\inf\{\upsilon\} := \min_j y_j$ are called the height and depth of the path.


A path can always be encoded by a word with $a, b, c$ representing ascents, descents,
and level steps, respectively. What we call the standard encoding is such a
word in which each step $a, b, c$ is (redundantly) subscripted by the value of the
$y$-coordinate of its initial point. For instance,

$$w = c_0\,a_0\,a_1\,a_2\,b_3\,c_2\,c_2\,a_2\,b_3\,b_2\,b_1\,a_0\,c_1$$

encodes a path that connects the initial point (0, 0) to the point (13, 1). Such a path
can also be regarded as the evolution in discrete time of a walk over the integer line
with jumps restricted to $\{-1, 0, +1\}$, or equivalently as a path in the graph:

[State diagram: states 0, 1, 2, ... on a line, with an edge $a_j$ from $j$ to $j+1$, an edge $b_j$ from
$j$ to $j-1$, and a loop $c_j$ at $j$.]


Lattice paths can also be interpreted as trajectories of birth-and-death processes, where 
a population can evolve at any discrete time by a birth or a death. (Compare with the 
pure-birth case in (21) above.) 
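The standard encoding is redundant by design: the subscripts can be recomputed from the letters alone, which makes an encoded word easy to validate. The sketch below decodes a word into its sequence of altitudes and rejects inconsistent subscripts; the string representation of letters (`"a0"`, `"b3"`, ...) and the function name are assumptions made for illustration.

```python
def decode(word):
    """Altitudes y_0, ..., y_n of the path with standard encoding `word`.

    `word` is a list of strings such as "a0", "b3", "c2": a letter in
    {a, b, c} (ascent, descent, level step) followed by the altitude of
    the step's initial point.
    """
    shift = {"a": +1, "b": -1, "c": 0}
    altitudes = [int(word[0][1:])] if word else [0]
    for step in word:
        kind, level = step[0], int(step[1:])
        if level != altitudes[-1]:
            raise ValueError(f"subscript of {step} does not match altitude {altitudes[-1]}")
        nxt = level + shift[kind]
        if nxt < 0:
            raise ValueError(f"step {step} leaves the quarter plane")
        altitudes.append(nxt)
    return altitudes


example = "c0 a0 a1 a2 b3 c2 c2 a2 b3 b2 b1 a0 c1".split()
ys = decode(example)
length, final_altitude, height = len(ys) - 1, ys[-1], max(ys)
```

Here `length`, `final_altitude`, and `height` recover the quantities of Definition V.3 for the example word above.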
As a preparation for later developments, let us examine the description of the
class written $\mathcal{H}^{[<1]}_{0,0}$ of Motzkin excursions of height $< 1$. We have

$$\mathcal{H}^{[<1]}_{0,0} = \mathrm{SEQ}(c_0) \quad\Longrightarrow\quad H^{[<1]}_{0,0} = \frac{1}{1-c_0}.$$

The class of excursions of height $< 2$ is obtained from there by a substitution

$$c_0 \mapsto c_0 + a_0\,\mathrm{SEQ}(c_1)\,b_1,$$

to the effect that

$$\mathcal{H}^{[<2]}_{0,0} = \mathrm{SEQ}\left(c_0 + a_0\,\mathrm{SEQ}(c_1)\,b_1\right) \quad\Longrightarrow\quad H^{[<2]}_{0,0} = \cfrac{1}{1 - c_0 - \cfrac{a_0b_1}{1-c_1}} = \frac{1-c_1}{1 - c_0 - c_1 + c_0c_1 - a_0b_1}.$$

Iteration of this simple mechanism lies at the heart of the calculations performed below.
Clearly, generating functions written in this way are nothing but a concise description
of usual counting generating functions: for instance if individual weights⁸
$\alpha_j, \beta_j, \gamma_j$ are assigned to the letters $a_j, b_j, c_j$ respectively, then the OGF of
multiplicatively weighted paths with $z$ marking length is obtained by setting

(39)   $$a_j = \alpha_j z, \qquad b_j = \beta_j z, \qquad c_j = \gamma_j z.$$

The general class of paths of interest in this subsection is defined by arbitrary
combinations of flooring (by $m$), ceiling (by $h$), as well as fixing initial ($k$) and final
($l$) altitudes. Accordingly, we define the following subclasses of the class $\mathcal{H}$ of all
Motzkin paths:

$$\mathcal{H}^{[\ge m,<h]}_{k,l} := \{w \in \mathcal{H} : \mathrm{ini}(w) = k,\ \mathrm{fin}(w) = l,\ m \le \inf\{w\},\ \sup\{w\} < h\}.$$

We shall also need the specializations,

$$\mathcal{H}^{[<h]}_{k,l} = \mathcal{H}^{[\ge 0,<h]}_{k,l}, \qquad \mathcal{H}^{[\ge m]}_{k,l} = \mathcal{H}^{[\ge m,<\infty]}_{k,l}, \qquad \mathcal{H}_{k,l} = \mathcal{H}^{[\ge 0,<\infty]}_{k,l}.$$

(Thus, the superscript indicates the condition that is to be satisfied by all ordinates of
vertices of the path.) Three simple combinatorial decompositions of paths (Figure 5)
then suffice to derive all the basic formulae.
— Arch decomposition: An excursion from and to level 0 consists of a sequence
of "arches", each made of either a $c_0$ or a $a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,b_1$, so that

(40)   $$\mathcal{H}_{0,0} = \mathrm{SEQ}\left(c_0 \cup a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,b_1\right),$$

which relativizes to height $< h$.
— Last passages decomposition. Recording the times at which each level
$0, \ldots, k$ is last traversed gives

(41)   $$\mathcal{H}_{0,k} = \mathcal{H}^{[\ge 0]}_{0,0}\,a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,a_1 \cdots a_{k-1}\,\mathcal{H}^{[\ge k]}_{k,k}.$$

— First passage decomposition. The quantities $\mathcal{H}_{k,l}$ with $k \le l$ are implicitly
determined by the first passage through $k$ in a path connecting level 0 to $l$,
so that

(42)   $$\mathcal{H}_{0,l} = \mathcal{H}^{[<k]}_{0,k-1}\,a_{k-1}\,\mathcal{H}_{k,l}.$$

(A dual decomposition holds when $k \ge l$.)

FIGURE V.5. The three major decompositions of lattice paths: the arch decomposition
(top), the last passages decomposition (bottom left), and the first passage decomposition
(bottom right).

⁸ Throughout this chapter, all weights are assumed to be nonnegative.


The basic results express the generating functions in terms of a fundamental continued
fraction and its associated convergent polynomials. They involve the "numerator"
and "denominator" polynomials, denoted by $P_h$ and $Q_h$, that are defined as solutions
to the second order (or "three-term") recurrence equation

(43)   $$Y_{h+1} = (1-c_h)\,Y_h - a_{h-1}b_h\,Y_{h-1}, \qquad h \ge 0,$$

together with the initial conditions $(P_{-1}, Q_{-1}) = (-1, 0)$, $(P_0, Q_0) = (0, 1)$, and
with the convention $a_{-1}b_0 = 1$. In other words, setting $C_j = 1-c_j$ and $A_j = a_{j-1}b_j$,
we have:

(44)   $$\begin{array}{llll} P_0 = 0, & P_1 = 1, & P_2 = C_1, & P_3 = C_1C_2 - A_2,\\ Q_0 = 1, & Q_1 = C_0, & Q_2 = C_0C_1 - A_1, & Q_3 = C_0C_1C_2 - C_2A_1 - C_0A_2. \end{array}$$

These polynomials are also known as continuant polynomials [308, 489].


▷ V.10. Combinatorics of continuant polynomials. The polynomial $Q_h$ is obtained by the
following process: start with the product $\Pi := C_0C_1\cdots C_{h-1}$; then cross out in all possible ways
pairs of adjacent elements $C_{j-1}C_j$, replacing each such crossed pair by $-A_j$. For instance, $Q_4$
is obtained as

$$C_0C_1C_2C_3 + \overbrace{C_0C_1}^{-A_1}C_2C_3 + C_0\overbrace{C_1C_2}^{-A_2}C_3 + C_0C_1\overbrace{C_2C_3}^{-A_3} + \overbrace{C_0C_1}^{-A_1}\,\overbrace{C_2C_3}^{-A_3}.$$

The polynomials $P_h$ are obtained similarly after a shift of indices. (These observations are due
to Euler; see [248, §6.7].) ◁
Proposition V.3 (Continued Fraction Theorem [168]). (i) The generating function
$H_{0,0}$ of all excursions is represented by the fundamental continued fraction:

(45)   $$H_{0,0} = \cfrac{1}{1 - c_0 - \cfrac{a_0b_1}{1 - c_1 - \cfrac{a_1b_2}{1 - c_2 - \cfrac{a_2b_3}{\ddots}}}}.$$

(ii) The generating function of ceiled excursions $H^{[<h]}_{0,0}$ is given by a convergent of the
fundamental continued fraction (45), with $P_h$, $Q_h$ as in Equation (43):

(46)   $$H^{[<h]}_{0,0} = \cfrac{1}{1 - c_0 - \cfrac{a_0b_1}{1 - c_1 - \cfrac{a_1b_2}{\ddots\ \cfrac{}{1 - c_{h-1}}}}} = \frac{P_h}{Q_h}.$$

(iii) The generating function of floored excursions is given by a truncation of the
fundamental fraction:

(47)   $$H^{[\ge h]}_{h,h} = \cfrac{1}{1 - c_h - \cfrac{a_hb_{h+1}}{1 - c_{h+1} - \cfrac{a_{h+1}b_{h+2}}{\ddots}}}$$

(48)   $$\phantom{H^{[\ge h]}_{h,h}} = \frac{1}{a_{h-1}b_h}\,\frac{Q_hH_{0,0} - P_h}{Q_{h-1}H_{0,0} - P_{h-1}}.$$


PROOF. Repeated use of the arch decomposition (40) provides a form of $H^{[<h]}_{0,0}$ with
nested quasi-inverses $(1-f)^{-1}$ that is the finite fraction representation (46), for
instance,

$$\mathcal{H}^{[<1]}_{0,0} \cong \mathrm{SEQ}\{c_0\}, \qquad \mathcal{H}^{[<2]}_{0,0} \cong \mathrm{SEQ}\{c_0 + a_0\,\mathrm{SEQ}\{c_1\}\,b_1\},$$
$$\mathcal{H}^{[<3]}_{0,0} \cong \mathrm{SEQ}\{c_0 + a_0\,\mathrm{SEQ}\{c_1 + a_1\,\mathrm{SEQ}\{c_2\}\,b_2\}\,b_1\}.$$

The continued fraction representation for basic paths without height constraints (namely
$H_{0,0}$) is then obtained by letting $h \to \infty$ in (46). Finally, the continued fraction
form (47) for floored excursions is nothing but the fundamental form (45), when the
indices are shifted. The three continued fraction expansions (45), (46), (47) are hence
established.

Finding explicit expressions for the fractions $H^{[<h]}_{0,0}$ and $H^{[\ge h]}_{h,h}$ next requires
determining the polynomials that appear in the convergents of the basic fraction (45).
By definition, the convergent polynomials $P_h$ and $Q_h$ are the numerator and denominator
of the fraction $H^{[<h]}_{0,0}$. For the computation of $H^{[<h]}_{0,0}$ and $P_h$, $Q_h$, one classically
introduces the linear fractional transformations

$$g_j(y) = \frac{1}{1 - c_j - a_jb_{j+1}\,y},$$

so that

(49)   $$H^{[<h]}_{0,0} = g_0 \circ g_1 \circ g_2 \circ \cdots \circ g_{h-1}(0) \quad\text{and}\quad H_{0,0} = g_0 \circ g_1 \circ g_2 \circ \cdots.$$

Now, linear fractional transformations are representable by $2 \times 2$ matrices

(50)   $$\frac{ay+b}{cy+d} \ \mapsto\ \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

in such a way that composition corresponds to matrix product. By induction on the
compositions that build up $H^{[<h]}_{0,0}$, there follows the equality

(51)   $$g_0 \circ g_1 \circ g_2 \circ \cdots \circ g_{h-1}(y) = \frac{P_h - P_{h-1}\,a_{h-1}b_h\,y}{Q_h - Q_{h-1}\,a_{h-1}b_h\,y},$$

where $P_h$ and $Q_h$ are seen to satisfy the recurrence (43). Setting $y = 0$ in (51)
proves (46).

Finally, $H^{[\ge h]}_{h,h}$ is determined implicitly as the root $y$ of the equation
$g_0 \circ \cdots \circ g_{h-1}(y) = H_{0,0}$, an equation that, when solved using (51), yields the form (48).
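Part (ii) of Proposition V.3 can be tested numerically: the recurrence (43) gives $P_h$ and $Q_h$ as polynomials in $z$ once the weights (39) are substituted, and the series of $P_h/Q_h$ should reproduce the total weight of excursions of height $< h$ computed by a direct transfer iteration. The sketch below does this with exact rationals; all function names are illustrative, and `alpha`, `beta`, `gamma` are assumed to be functions giving the weights $\alpha_j$, $\beta_j$, $\gamma_j$.

```python
from fractions import Fraction


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_sub(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [x - y for x, y in zip(p, q)]


def convergents(alpha, beta, gamma, h):
    """P_h and Q_h from (43), as coefficient lists in z, with the weights (39)."""
    P_prev, Q_prev = [Fraction(-1)], [Fraction(0)]     # P_{-1}, Q_{-1}
    P, Q = [Fraction(0)], [Fraction(1)]                # P_0, Q_0
    for j in range(h):
        C = [Fraction(1), -Fraction(gamma(j))]         # 1 - c_j
        if j == 0:
            A = [Fraction(1)]                          # convention a_{-1} b_0 = 1
        else:
            A = [Fraction(0), Fraction(0), Fraction(alpha(j - 1)) * beta(j)]
        P, P_prev = poly_sub(poly_mul(C, P), poly_mul(A, P_prev)), P
        Q, Q_prev = poly_sub(poly_mul(C, Q), poly_mul(A, Q_prev)), Q
    return P, Q


def series_quotient(P, Q, n_terms):
    """First n_terms coefficients of P(z)/Q(z); requires Q[0] != 0."""
    out = []
    for n in range(n_terms):
        s = P[n] if n < len(P) else Fraction(0)
        for k in range(1, min(n, len(Q) - 1) + 1):
            s -= Q[k] * out[n - k]
        out.append(s / Q[0])
    return out


def excursion_weights(alpha, beta, gamma, h, n_max):
    """Total weight of excursions of length 0..n_max with altitudes in [0, h)."""
    state = [Fraction(0)] * h
    state[0] = Fraction(1)
    out = []
    for _ in range(n_max + 1):
        out.append(state[0])
        new = [Fraction(0)] * h
        for y, w in enumerate(state):
            new[y] += w * gamma(y)               # level step c_y
            if y + 1 < h:
                new[y + 1] += w * alpha(y)       # ascent a_y
            if y > 0:
                new[y - 1] += w * beta(y)        # descent b_y
        state = new
    return out
```

With unit weights (`lambda j: 1` for all three), the comparison yields the numbers of Motzkin excursions of bounded height; other weight functions can be substituted without changing the code.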


A large number of generating functions can be derived by similar techniques. We 
refer to the article [168], where this theory was first systematically developed and 
to the exposition given in [244, Chapter 5]. Our presentation also draws upon [188] 
where the theory was put to use in order to develop a formal algebraic theory of general 
birth-and-death processes in continuous time. 


▷ V.11. Transitions and crossings. The lattice paths $\mathcal{H}_{0,l}$ corresponding to the transitions from
altitude 0 to $l$ and $\mathcal{H}_{k,0}$ (from $k$ to 0) have OGFs

$$H_{0,l} = \frac{1}{\mathfrak{B}_l}\left(Q_lH_{0,0} - P_l\right), \qquad H_{k,0} = \frac{1}{\mathfrak{A}_k}\left(Q_kH_{0,0} - P_k\right).$$

The crossings $\mathcal{H}^{[<h]}_{0,h-1}$ and $\mathcal{H}^{[<h]}_{h-1,0}$ have OGFs,

$$H^{[<h]}_{0,h-1} = \frac{\mathfrak{A}_{h-1}}{Q_h}, \qquad H^{[<h]}_{h-1,0} = \frac{\mathfrak{B}_{h-1}}{Q_h}.$$

(Abbreviations used here are: $\mathfrak{A}_m = a_0\cdots a_{m-1}$, $\mathfrak{B}_m = b_1\cdots b_m$.) These extensions provide
combinatorial interpretations for fractions of the form $1/Q_h$. They result from the basic
decompositions combined with Proposition V.3; see [168, 188] for details. ◁


▷ V.12. Denominator polynomials and orthogonality. Let $H_n = [z^n]H_{0,0}(z)$ represent
the number of all excursions of length $n$ equipped with nonnegative weights. Define a linear
functional $\mathcal{L}$ on the space $\mathbb{C}[z]$ of polynomials by $\mathcal{L}[z^n] = H_n$. Introduce the reciprocal
polynomials: $\overline{Q}_h(z) = z^hQ_h(1/z)$. The fact deducible from Note 11 that $Q_lH_{0,0} - P_l = O(z^{2l})$
corresponds to the property $\mathcal{L}[z^j\,\overline{Q}_l] = 0$ for all $0 \le j < l$. In other words, the polynomials
$\overline{Q}_l$ are orthogonal with respect to the special scalar product $\langle f, g\rangle := \mathcal{L}[fg]$. (Historically, the
theory of orthogonal polynomials evolved from the theory of continued fractions before living
a life of its own; see [88, 277, 457] for its many facets.) ◁
▷ V.13. Discrete time birth-and-death processes. Assume that, at discrete times $n = 0, 1, 2, \ldots$,
a population of size $j$ can grow by one element [a birth] with probability $\alpha_j$, decrease by one
element [a death] with probability $\beta_j$, and stay the same with probability $\gamma_j = 1 - \alpha_j - \beta_j$.
Let $\omega_n$ be the probability that an initially empty population is again empty at time $n$. Then the
GF of the sequence $(\omega_n)$ is

$$\sum_{n\ge 0} \omega_n z^n = \cfrac{1}{1 - \gamma_0z - \cfrac{\alpha_0\beta_1z^2}{1 - \gamma_1z - \cfrac{\alpha_1\beta_2z^2}{\ddots}}}.$$

This result was found by I. J. Good in 1958: see [243]. ◁


▷ V.14. Continuous time birth-and-death processes. Consider a continuous time birth-and-death
process, where a transition from state $j$ to $j+1$ takes place according to an exponential
distribution of rate $\lambda_j$ and a transition from $j$ to $j-1$ has rate $\mu_j$. Let $\varpi(t)$ be the probability
to be in state 0 at time $t$ starting from state 0 at time 0. One has

$$\int_0^\infty e^{-st}\,\varpi(t)\,dt = \cfrac{1}{s + \lambda_0 - \cfrac{\lambda_0\mu_1}{s + \lambda_1 + \mu_1 - \cfrac{\lambda_1\mu_2}{\ddots}}} = \cfrac{1}{s + \cfrac{\lambda_0}{1 + \cfrac{\mu_1}{s + \cfrac{\lambda_1}{1 + \cfrac{\mu_2}{\ddots}}}}}.$$

Thus, continued fractions and orthogonal polynomials may be used to analyse birth-and-death
processes. (This fact was originally discovered by Karlin and McGregor [289], with later
additions due to Jones and Magnus [285]. See [188] for a systematic discussion in relation to
combinatorial theory.) ◁


V.3.2. Analytic aspects. We now consider the general asymptotic properties of
lattice paths of height bounded from above by a fixed integer $h \ge 1$. Letters denoting
elementary steps are weighted, as previously indicated, with

$$a_j = \alpha_j z, \qquad b_j = \beta_j z, \qquad c_j = \gamma_j z,$$

the weights being invariably nonnegative. We shall limit the discussion to excursions,
which are often the most interesting objects from the combinatorial point of view.

As a preamble, in the Dyck case, where all $\gamma_j$ are 0 (level steps are disallowed),
the GF $H^{[<h]}_{0,0}$ is a function of $z^2$ only, since it takes an even number of steps to return
to altitude 0 when starting from altitude 0. In such a case, we shall systematically
assume that, when considering $[z^n]H^{[<h]}_{0,0}$, the index $n = 2\nu$ is even. In order to
avoid trivialities, we also assume that none of the coefficients attached to ascents and
descents are 0.


Theorem V.3 (Asymptotics of nested sequences). Consider the class $\mathcal{H}^{[<h]}_{0,0}$ of weighted
Motzkin excursions of height $< h$. Their number satisfies a pure exponential-polynomial
formula,

$$H^{[<h]}_{0,0,n} = cB^n + O(C^n),$$

where $B > 0$ and $0 < C < B$. In the Dyck case, it is assumed furthermore that $n \equiv 0$
(mod 2).
PROOF. The proof⁹ proceeds by induction according to the depth of nesting of the
sequence constructions. Write

$$f_j(z) := H^{[\ge h-j-1,\,<h]}_{h-j-1,\,h-j-1}(z),$$

and let $\rho_j$ denote the dominant singularity of $f_j$ that is positive (existence is guaranteed
by Pringsheim's Theorem).

For ease of discussion, we first examine the case where all $\gamma_j$ are nonzero. The
function $f_0(z)$ is

$$f_0(z) = \frac{1}{1 - \gamma_{h-1}z},$$

and one has $\rho_0 = 1/\gamma_{h-1}$. The function $f_1$ is given by

$$f_1(z) = \frac{1}{1 - \gamma_{h-2}z - \alpha_{h-2}\beta_{h-1}z^2 f_0(z)}.$$

The quantity $\gamma_{h-2}z + \alpha_{h-2}\beta_{h-1}z^2f_0(z)$ in its denominator increases continuously
from 0 to $+\infty$ as $z$ increases from 0 to $\rho_0$; consequently, it crosses the value 1 at some
point which must be $\rho_1$. In particular, one must have $\rho_1 < \rho_0$. Our assumption that
all the $\gamma_j$ are nonzero implies the absence of periodicities, so that $\rho_1$ is the unique
dominant singularity. The argument can be repeated, implying that the sequence of
radii is decreasing $\rho_0 > \rho_1 > \rho_2 > \cdots$, the corresponding poles are all simple, and
they are uniquely dominating. The statement is thus established in the case that all the
$\gamma_j$ are nonzero.

Dually, in the Dyck case where all the $\gamma_j$ are zero, one can reason in a similar
manner, operating with the collection of "condensed" series $f_j(\sqrt{z})$, which are seen
to have a unique dominant singularity. This implies that $f_j(z)$ itself has exactly two
dominant singularities, namely $\rho_j$ and $-\rho_j$, both being simple poles.

In the mixed case, the $f_j$ are initially of the Dyck type, till a certain $\gamma_{h-1-j_0} \neq 0$
is encountered. In that case the function $f_{j_0}$ is aperiodic (its span in the sense of
Chapter IV is 1). The reasoning then continues like in the Motzkin case, with all the
subsequent $f_j$ (for $j > j_0$) including $f_{h-1}(z) \equiv H^{[<h]}_{0,0}(z)$ having a unique dominant
singularity.
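The pure exponential growth asserted by Theorem V.3 is visible numerically on small cases. The sketch below counts unweighted Motzkin excursions of height $< h$ (all weights equal to 1, an assumption made for illustration) and returns the ratio of consecutive counts, which approaches the constant $B$. For unit weights the counts are powers of a tridiagonal 0/1 matrix with ones on the three central diagonals, whose largest eigenvalue is $1 + 2\cos(\pi/(h+1))$; this standard linear-algebra fact is not taken from the text and is used only as an independent check.

```python
from math import cos, pi


def motzkin_excursions_below(h, n_max):
    """Numbers of unweighted Motzkin excursions of length 0..n_max with altitudes in [0, h)."""
    state = [1] + [0] * (h - 1)
    counts = []
    for _ in range(n_max + 1):
        counts.append(state[0])
        new = [0] * h
        for y, w in enumerate(state):
            new[y] += w                  # level step
            if y + 1 < h:
                new[y + 1] += w          # ascent
            if y > 0:
                new[y - 1] += w          # descent
        state = new
    return counts


def growth_ratio(h, n):
    """Ratio of the counts at lengths n and n - 1, an estimate of B."""
    counts = motzkin_excursions_below(h, n)
    return counts[n] / counts[n - 1]


def unit_weight_growth(h):
    """Largest eigenvalue of the h x h transfer matrix with unit weights."""
    return 1 + 2 * cos(pi / (h + 1))
```

For $h = 2$ the counts are exactly $2^{n-1}$ for $n \ge 1$, so the ratio equals 2 at every step; for larger $h$ the ratio converges geometrically, in line with the $O(C^n)$ error term.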

Similar devices yield a characterization of the profile of a random path, that is, 
the number of times a given step appears in a random excursion. 


Theorem V.4 (Profile of nested sequences). Let $X_n$ be the random variable representing the number of times a given step (of type $a_j$, $b_j$, or $c_j$) with nonzero weight appears in a random excursion of length $n$ and height $< h$. The moments of $X_n$ satisfy

$$\mathbb{E}(X_n) = c_1 n + d_1 + O(D^n), \qquad \mathbb{V}(X_n) = c_2 n + d_2 + O(D^n),$$

for constants $c_1, c_2, d_1, d_2, D$, with $c_1, c_2 > 0$ and $0 < D < 1$. In particular the distribution of $X_n$ is concentrated.


Footnote. The present discussion is also related to the analysis of the supercritical sequence schema in the next section.
PROOF. Introduce an auxiliary variable $u$, with $u$ marking the number of designated steps, and form the corresponding BGF $H(z,u)$. We only discuss the expectation. The function $H$ is a linear fractional transformation in $u$ of the form

$$H(z,u) = A(z) + \frac{1}{C(z) + uD(z)}.$$

(The coefficients $A$, $C$, $D$ are a priori in $\mathbb{C}(z)$; they are in fact computable from Proposition V.3.) Then, one has

$$\left.\frac{\partial}{\partial u} H(z,u)\right|_{u=1} = \frac{-D(z)}{(C(z) + D(z))^2}.$$

This function resembles $H(z,1)^2$. An application of the chain rule permits us to verify that indeed

$$\left.\frac{\partial}{\partial u} H(z,u)\right|_{u=1} = E(z)\,H(z,1)^2,$$

where $E(z)$ is analytic in a disc larger than the disc of analyticity of $H(z,1)$. The analysis of the dominant double pole then yields the result. (The determination of the second moment follows along similar lines, though the computations become more intricate.)



▷ V.15. All poles are real. Assume again $\alpha_j \beta_{j+1} > 0$ and $\gamma_j > 0$. By Note 12, the denominator polynomials $Q_h$ are reciprocals of a family of polynomials $\overline{Q}_h$ that are formally orthogonal with respect to a scalar product. Thus the zeros of any of the $\overline{Q}_h$ are all real, and so are the zeros of $Q_h$. Consequently: The poles of the OGF of ceiled excursions $H^{[<h]}_{0,0}$ are all real. (See for instance [457, §3.3] for the basic argument.) ◁


V.3.3. Applications. Lattice paths corresponding to nested sequences have
quite a wide range of descriptive power, especially when weights are allowed. We
illustrate this fact by three types of examples. 


— Example 6 provides a complete analysis of height in Dyck paths and general 
plane rooted trees, as regards moments as well as distribution. This is the 
simplest case of a continued fraction with constant coefficients attached to 
the OGF of Catalan numbers and Fibonacci-Chebyshev polynomials. 

— Example 7 discusses coin fountains. There, we are dealing with an infinite 
continued fraction to which the techniques of the previous subsection can be 
extended. The developments also take us close to the realm of q-calculus
and to the analysis of alcohols seen in Chapter IV. 

— Example 8 constitutes a typical application of the possibility of encoding 
combinatorial structures—here we examine interconnection networks—by 
means of lattice paths weighted by integers. The enumeration involves Hermite
polynomials. Other examples related to set partitions and permutations
are described in the accompanying notes. 


EXAMPLE V.6. Height of Dyck paths and plane rooted trees. In order to count lattice paths of the Dyck ($\mathcal{D}$) or Motzkin ($\mathcal{M}$) type, it suffices to effect one of the substitutions,

$$\sigma_M:\ a_j \mapsto z,\ b_j \mapsto z,\ c_j \mapsto z; \qquad \sigma_D:\ a_j \mapsto z,\ b_j \mapsto z,\ c_j \mapsto 0.$$


FIGURE V.6. Three random Dyck paths of length 2n = 500 have heights resp. 20, 31, 
24: the distribution is spread, see Proposition V.4. 


We henceforth restrict attention to the case of Dyck paths. See Figure 6 for three simulations suggesting that the distribution of height is somewhat spread. Given the parenthesis system representation (Note I.45, p. 73), the height of a Dyck path automatically translates into the height of the corresponding plane rooted tree.

The continued fraction expressing $H_{0,0}$ results immediately from Proposition V.3 and is in this case periodic (here, in the sense that its stages are all alike), so that it represents a quadratic function,

$$H_{0,0}(z) = \cfrac{1}{1 - \cfrac{z^2}{1 - \cfrac{z^2}{1 - \ddots}}} = \frac{1}{2z^2}\left(1 - \sqrt{1 - 4z^2}\right),$$

since $H_{0,0}$ satisfies $y = (1 - z^2 y)^{-1}$. The families of polynomials $P_h$, $Q_h$ are in this case determined by a recurrence with constant coefficients. Define classically the Fibonacci polynomials by the recurrence

(52) $$F_{h+2}(z) = F_{h+1}(z) - z F_h(z), \qquad F_0(z) = 0, \quad F_1(z) = 1.$$


One finds $Q_h = F_{h+1}(z^2)$ and $P_h = F_h(z^2)$. (The Fibonacci polynomials are reciprocals of Chebyshev polynomials; see Note 16.) By Proposition V.3, the GF of paths of height $< h$ is then

$$H^{[<h]}_{0,0}(z) = \frac{F_h(z^2)}{F_{h+1}(z^2)}.$$

(We get more and, for instance, the number of ways of crossing a strip of width $h - 1$ is $H^{[<h]}_{0,h-1}(z) = z^{h-1}/F_{h+1}(z^2)$.) Note that the polynomials have an explicit form,

$$F_h(z) = \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} \binom{h-1-k}{k} (-z)^k,$$

as follows from the generating function expression: $\sum_h F_h(z) y^h = y/(1 - y + z y^2)$.
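The relations above lend themselves to a quick machine check. The following Python sketch is an illustration only: the helper names (`fib_poly`, `fib_poly_explicit`, `bounded_height_series`, `bounded_dyck`) are hypothetical and not part of the text. It builds $F_h$ from the recurrence (52), compares it with the explicit binomial form, and checks that the Taylor coefficients of $F_h(w)/F_{h+1}(w)$ in $w = z^2$ count Dyck paths of height $< h$.

```python
from math import comb

def fib_poly(h):
    """Coefficients (lowest degree first) of F_h(z), built from recurrence (52)."""
    prev, cur = [0], [1]                      # F_0 = 0, F_1 = 1
    if h == 0:
        return prev
    for _ in range(h - 1):
        # F_{k+2}(z) = F_{k+1}(z) - z * F_k(z)
        nxt = cur + [0] * (len(prev) + 1 - len(cur))
        for i, c in enumerate(prev):
            nxt[i + 1] -= c
        while len(nxt) > 1 and nxt[-1] == 0:  # drop trailing zero coefficients
            nxt.pop()
        prev, cur = cur, nxt
    return cur

def fib_poly_explicit(h):
    """Explicit form: sum over k of binom(h-1-k, k) (-z)^k, for h >= 1."""
    return [(-1) ** k * comb(h - 1 - k, k) for k in range((h - 1) // 2 + 1)]

def bounded_height_series(h, order):
    """Coefficients of w^0..w^order in F_h(w)/F_{h+1}(w), where w = z^2."""
    num, den = fib_poly(h), fib_poly(h + 1)   # den[0] == 1
    out = []
    for n in range(order + 1):
        s = num[n] if n < len(num) else 0
        s -= sum(den[k] * out[n - k] for k in range(1, min(n, len(den) - 1) + 1))
        out.append(s)
    return out

def bounded_dyck(n, h):
    """Direct count of Dyck paths of length 2n whose altitudes stay in [0, h-1]."""
    ways = [0] * h
    ways[0] = 1
    for _ in range(2 * n):
        new = [0] * h
        for alt, w in enumerate(ways):
            if alt + 1 < h:
                new[alt + 1] += w
            if alt >= 1:
                new[alt - 1] += w
        ways = new
    return ways[0]
```

For instance, `fib_poly(5)` yields the coefficient list of $1 - 3z + z^2$, and `bounded_height_series(h, N)` can be compared entry by entry with `bounded_dyck(n, h)` for $n \le N$.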

The equivalence between Dyck paths and (general) plane tree traversals discussed in Chapter I implies that trees of height at most $h$ and size $n + 1$ are equinumerous with Dyck paths of length $2n$ and height at most $h$. Set for convenience

$$G^{[h]}(z) = z\,H^{[<h+1]}_{0,0}(z^{1/2}) = z\,\frac{F_{h+1}(z)}{F_{h+2}(z)},$$

which is precisely the OGF of general plane trees having height $\le h$. (This is otherwise in agreement with the continued fraction forms obtained directly in Chapter III: cf (52), p. 184 and (75), p. 205.) It is possible to go much further, as first shown by De Bruijn, Knuth, and Rice in a landmark paper [113], which also constitutes the historic application of Mellin transforms in analytic combinatorics. (We refer to this paper for historical context and references.)
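First, solving the linear recurrence (52) with $z$ treated as a parameter yields the alternative closed form expression

(53) $$F_h(z) = \frac{G^h - \overline{G}^h}{G - \overline{G}}, \qquad G = \frac{1 - \sqrt{1 - 4z}}{2}, \quad \overline{G} = \frac{1 + \sqrt{1 - 4z}}{2}.$$

There, $G(z)$ is the OGF of all trees, and an equivalent form of $G^{[h]}$ is provided by

(54) $$G - G^{[h-2]} = \sqrt{1 - 4z}\,\frac{u^h}{1 - u^h}, \qquad \text{where } u = \frac{1 - \sqrt{1 - 4z}}{1 + \sqrt{1 - 4z}} = \frac{G^2}{z},$$

as is easily verified. Thus $G^{[h]}$ can be expressed in terms of $G(z)$ and $z$:

$$G - G^{[h-2]} = \sqrt{1 - 4z}\,\sum_{j \ge 1} G(z)^{2jh} z^{-jh}.$$

The Lagrange-Bürmann inversion theorem then gives after a simple calculation

(55) $$G_{n+1} - G^{[h-2]}_{n+1} = \sum_{j \ge 1} \Delta^2 \binom{2n}{n - jh},$$

where

$$\Delta^2 \binom{2n}{n - m} := \binom{2n}{n + 1 - m} - 2\binom{2n}{n - m} + \binom{2n}{n - 1 - m}.$$

Consequently, the number of trees of height $\ge h - 1$ admits a closed form: it is a "sampled" sum, by steps of $h$, of the $2n$th line of Pascal's triangle (upon taking second order differences).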
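As a sanity check on the sampled-sum formula (55), one may compare it with a direct count. The sketch below is illustrative only (function names are hypothetical). It relies on the equivalence stated earlier: trees with $n+1$ nodes and height at most $h$ correspond to Dyck paths of length $2n$ and height at most $h$, so that $G_{n+1}$ is a Catalan number and $G^{[h-2]}_{n+1}$ is obtained by a small dynamic program.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside the range 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def second_difference(n, m):
    """Delta^2 binom(2n, n-m), as defined just after (55)."""
    return binom(2 * n, n + 1 - m) - 2 * binom(2 * n, n - m) + binom(2 * n, n - 1 - m)

def sampled_sum(n, h):
    """Right-hand side of (55): trees with n+1 nodes and height >= h-1."""
    return sum(second_difference(n, j * h) for j in range(1, (n + 1) // h + 1))

def dyck_height_at_most(n, h):
    """Dyck paths of length 2n with height <= h (equivalently G^{[h]}_{n+1})."""
    if h < 0:
        return 0
    ways = [0] * (h + 1)
    ways[0] = 1
    for _ in range(2 * n):
        new = [0] * (h + 1)
        for alt, w in enumerate(ways):
            if alt + 1 <= h:
                new[alt + 1] += w
            if alt >= 1:
                new[alt - 1] += w
        ways = new
    return ways[0]

def catalan(n):
    return comb(2 * n, n) // (n + 1)
```

With these helpers, `catalan(n) - dyck_height_at_most(n, h - 2)` and `sampled_sum(n, h)` should agree for all $n \ge 1$ and $h \ge 2$; for example both equal 5 when $n = 3$, $h = 2$.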

The relation (55) leads easily to the asymptotic distribution of height in random trees of size $n$. Stirling's formula yields the Gaussian approximation of binomial numbers: for $k = o(n^{3/4})$ and with $w = k/\sqrt{n}$, one finds

(56) $$\frac{\binom{2n}{n-k}}{\binom{2n}{n}} \sim e^{-w^2}\left(1 - \frac{w^4 - 3w^2}{6n} + \frac{5w^8 - 54w^6 + 135w^4 - 60w^2}{360n^2} + \cdots\right).$$

The use of the Gaussian approximation (56) inside the exact formula (55) then implies: The probability that a tree of size $n + 1$ has height at least $h - 1$ satisfies uniformly for $h \in [\alpha\sqrt{n}, \beta\sqrt{n}]$ (for any $\alpha, \beta$ such that $0 < \alpha < \beta < \infty$) the estimate

(57) $$\frac{G_{n+1} - G^{[h-2]}_{n+1}}{G_{n+1}} = \Theta\!\left(\frac{h}{\sqrt{n}}\right) + O\!\left(\frac{1}{n}\right), \qquad \Theta(x) := \sum_{j \ge 1} e^{-j^2x^2}(4j^2x^2 - 2).$$

The function $\Theta(x)$ is a "theta function" which classically arises in the theory of elliptic functions [492]. Since binomial coefficients decay fast away from the center, simple bounds also show that the probability of height to be at least $n^{1/2+\epsilon}$ decays like $\exp(-n^{2\epsilon})$, hence is exponentially small. Note also that the probability distribution of height $H$ itself admits of an exact expression obtained by differencing (55), which is reflected asymptotically by differentiation of the estimate of (57):

(58) $$\mathbb{P}_{\mathcal{G}_{n+1}}\left[H = \lfloor x\sqrt{n} \rfloor\right] = -\frac{1}{\sqrt{n}}\,\Theta'(x) + O\!\left(\frac{1}{n}\right), \qquad \Theta'(x) := \sum_{j \ge 1} e^{-j^2x^2}(12j^2x - 8j^4x^3).$$
FIGURE V.7. The limit density of the distribution of height, $-\Theta'(x)$.


The forms (57) and (58) also give access to moments of the distribution of height. We find

$$\mathbb{E}_{\mathcal{G}_{n+1}}[H^r] \sim \frac{1}{\sqrt{n}}\,S_r\!\left(\frac{1}{\sqrt{n}}\right), \qquad \text{where } S_r(y) := -\sum_{h \ge 1} h^r\,\Theta'(hy).$$

The quantity $y^{r+1}S_r(y)$ is a Riemann sum relative to the function $-x^r\Theta'(x)$, and the step $y = n^{-1/2}$ decreases to 0 as $n \to \infty$. Approximating the sum by the integral, one gets:

$$\mathbb{E}_{\mathcal{G}_{n+1}}[H^r] \sim \mu_r n^{r/2}, \qquad \text{where } \mu_r := -\int_0^\infty x^r\,\Theta'(x)\,dx.$$

The integral giving $\mu_r$ is a Mellin transform in disguise (set $s = r + 1$) to which the treatment of harmonic sums applies. We then get upon replacing $n + 1$ by $n$:


Proposition V.4. The expected height of a random plane rooted tree comprising $n + 1$ nodes is

(59) $$\sqrt{\pi n} - \frac{1}{2} + o(1).$$

More generally, the moment of order $r$ of height is asymptotic to

(60) $$\mu_r n^{r/2}, \qquad \text{where } \mu_r = r(r-1)\,\Gamma(r/2)\,\zeta(r).$$

The random variable $H/\sqrt{n}$ obeys asymptotically a Theta distribution, in the sense of both the "central" estimate (57) and the "local" estimate (58). The same asymptotic estimates hold for height of Dyck paths having length $2n$.


The improved estimate of the mean (59) is from [113]. The general form of moments in (60) is in fact valid for any real $r$ (not just integers). An alternative formula for the Theta function appears in the Note below. Figure 7 plots the limit density $-\Theta'(x)$. END OF EXAMPLE V.6.
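The leading-order growth of the mean height can be observed numerically. The sketch below is an illustration under stated assumptions: the function name `mean_height` is hypothetical, the mean is computed as $\sum_{m \ge 1} \mathbb{P}(H \ge m)$ with each tail probability taken from the sampled sum (55), and only the leading term $\sqrt{\pi n}$ is compared, not the lower-order constant.

```python
from math import comb, pi, sqrt

def mean_height(n):
    """Exact mean height over plane trees with n+1 nodes (Dyck paths of length 2n).

    Uses E[H] = sum_{m>=1} P(H >= m), where the number of trees with
    height >= h-1 is the sampled sum of second differences in (55).
    """
    row = [comb(2 * n, k) for k in range(2 * n + 1)]   # 2n-th line of Pascal's triangle

    def b(k):
        return row[k] if 0 <= k <= 2 * n else 0

    catalan = row[n] // (n + 1)
    total = 0
    for h in range(2, n + 2):                 # event "H >= h-1", with h-1 = 1..n
        for j in range(1, (n + 1) // h + 1):
            m = j * h
            total += b(n + 1 - m) - 2 * b(n - m) + b(n - 1 - m)
    return total / catalan

if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, mean_height(n), sqrt(pi * n))
```

Small cases can be checked by hand: the two trees with 3 nodes have heights 1 and 2, and the five trees with 4 nodes have heights 3, 2, 2, 2, 1.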


▷ V.16. Height and Fibonacci-Chebyshev polynomials. The reciprocal polynomials $\overline{F}_h(z) = z^{h-1}F_h(1/z^2)$ are related to the classical Chebyshev polynomials by $\overline{F}_h(2z) = U_{h-1}(z)$, where $U_h(\cos\theta) = \sin((h+1)\theta)/\sin\theta$. (This is readily verified from the recurrence (52) and elementary trigonometry.) Thus, the roots of $F_h(z)$ are $(4\cos^2(j\pi/h))^{-1}$ and the partial fraction expansion of $G^{[h]}(z)$ can be worked out explicitly [113]. There results, for $n \ge 1$,

(61) $$G^{[h-2]}_{n+1} = \frac{4^{n+1}}{h} \sum_{1 \le j < h/2} \sin^2\frac{j\pi}{h}\,\cos^{2n}\frac{j\pi}{h},$$

which provides in particular an asymptotic form for any fixed $h$. (This formula can also be found directly from the sampled sum (55) by multisection of series.) Asymptotic analysis of this last expression when $h = x\sqrt{n}$ yields the alternative expression

$$\lim_{n \to \infty} \mathbb{P}_{\mathcal{G}_{n+1}}\left[H \le x\sqrt{n}\right] = 4\pi^{5/2}x^{-3} \sum_{j \ge 0} j^2 e^{-j^2\pi^2/x^2} \quad (\equiv 1 - \Theta(x)),$$

which, when compared with (57), reflects an important transformation formula of elliptic functions [492]. See the study by Biane, Pitman, and Yor [52] for fascinating connections with Brownian motion and the functional equation of the Riemann zeta function. ◁


▷ V.17. Motzkin paths. The OGF of Motzkin paths of height $< h$ is

$$\frac{1}{1 - z}\,H^{[<h]}_{0,0}\!\left(\frac{z}{1 - z}\right),$$

where $H^{[<h]}_{0,0}$ refers to Dyck paths. Therefore, such paths of length $n$ can be enumerated exactly by formulae derived from (55)-(61). In particular, the mean height is $\sim \sqrt{\pi n/3}$. ◁


▷ V.18. Height in simple varieties of trees. Consider a simple variety of trees corresponding to the GF equation $Y(z) = z\phi(Y(z))$ (see Chapter III) and values of $n$ such that there exists a tree of size $n$. Assume that there exists a positive $\tau$ strictly within the disc of convergence of $\phi$ such that $\tau\phi'(\tau) - \phi(\tau) = 0$. Then, the $r$th moment of height ($H$) is asymptotically $\xi^{r/2}\,r(r-1)\,\Gamma(r/2)\,\zeta(r)\,n^{r/2}$. The normalized quantity $\widehat{H} = H/\xi$ obeys asymptotically a Theta distribution in the sense of both the central estimate (57) and the local estimate (58). [This is from [197] and [180] respectively.] For instance, $\xi = 2$ for plane binary trees and $\xi = 2$ for Cayley trees. ◁


EXAMPLE V.7. Area under Dyck path and coin fountains. Consider Dyck paths and the parameter equal to area below the path. Area under a lattice path is taken here as the sum of the indices (i.e., the starting altitudes) of all the variables that enter the standard encoding of the path. Thus, the BGF $D(z,q)$ of Dyck paths with $z$ marking half-length and $q$ marking area is obtained by the substitution

$$a_j \mapsto q^j z, \qquad b_j \mapsto q^j, \qquad c_j \mapsto 0$$

inside the fundamental continued fraction (45). (We rederive here Equation (53) of Chapter III, p. 184.) It proves convenient to operate with the continued fraction

(62) $$F(z,q) = \cfrac{1}{1 - \cfrac{zq}{1 - \cfrac{zq^2}{1 - \cfrac{zq^3}{\ddots}}}},$$

so that $D(z,q) = F(q^{-1}z, q^2)$. Since $F$ and $D$ satisfy difference equations, for instance,

(63) $$F(z,q) = \frac{1}{1 - qz\,F(qz,q)},$$

moments of area can be determined by differentiating and setting $q = 1$ (see Chapter III for such a direct approach).
A general trick from q-calculus is effective for deriving an alternative form of $F$. Attempt to express the continued fraction $F$ of (62) as a quotient $F(z,q) = A(z)/B(z)$. Then, the relation (63) implies

$$\frac{A(z)}{B(z)} = \cfrac{1}{1 - qz\,\cfrac{A(qz)}{B(qz)}},$$

hence $A(z) = B(qz)$, $B(z) = B(qz) - qz\,B(q^2z)$,
where $q$ is treated as a parameter. The difference equation satisfied by $B(z)$ is then readily solved by indeterminate coefficients. (This classical technique was introduced in the theory of integer partitions by Euler.) With $B(z) = \sum b_n z^n$, the coefficients satisfy the recurrence

$$b_0 = 1, \qquad b_n = q^n b_n - q^{2n-1} b_{n-1}.$$

This is a first order recurrence on $b_n$ that unwinds to give

$$b_n = (-1)^n \frac{q^{n^2}}{(1 - q)(1 - q^2) \cdots (1 - q^n)}.$$

In other words, introducing the "q-exponential function",

(64) $$E(z,q) = \sum_{n=0}^{\infty} \frac{(-z)^n q^{n^2}}{(q)_n}, \qquad \text{where } (q)_n = (1 - q)(1 - q^2) \cdots (1 - q^n),$$

one finds

(65) $$F(z,q) = \frac{E(qz,q)}{E(z,q)}.$$
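The identity (65) can be tested numerically at sample points. The following sketch is an illustration only (function names and truncation depths are assumptions): it evaluates the continued fraction (62) bottom-up at a finite depth and the q-exponential (64) as a truncated sum, in floating point, for values $|q| < 1$ where both converge quickly.

```python
def q_exponential(z, q, terms=200):
    """Truncated sum for E(z, q) = sum_{n>=0} (-z)^n q^(n^2) / (q)_n."""
    total, pochhammer = 0.0, 1.0
    for n in range(terms):
        if n > 0:
            pochhammer *= 1.0 - q ** n        # (q)_n = (1-q)(1-q^2)...(1-q^n)
        total += (-z) ** n * q ** (n * n) / pochhammer
    return total

def continued_fraction_F(z, q, depth=200):
    """F(z, q) of (62), truncated at a finite depth and evaluated bottom-up."""
    value = 1.0
    for k in range(depth, 0, -1):
        value = 1.0 - z * q ** k / value
    return 1.0 / value

def ratio_form_F(z, q):
    """Right-hand side of (65): E(qz, q) / E(z, q)."""
    return q_exponential(q * z, q) / q_exponential(z, q)
```

For example, at $z = 0.4$, $q = 0.5$ both `continued_fraction_F` and `ratio_form_F` return a value close to 1.288, and the difference equation (63) can be checked in the same way.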


Given the importance of the functions under discussion in various branches of mathematics, we cannot resist a quick digression. The name of the q-exponential comes from the obvious property that $E(z(1 - q), q)$ reduces to $e^{-z}$ as $q \to 1^-$. The explicit form (64) constitutes in fact the "easy half" of the proof of the celebrated Rogers-Ramanujan identities, namely,

$$E(-1,q) = \sum_{n=0}^{\infty} \frac{q^{n^2}}{(q)_n} = \prod_{n=0}^{\infty} (1 - q^{5n+1})^{-1}(1 - q^{5n+4})^{-1},$$
$$E(-q,q) = \sum_{n=0}^{\infty} \frac{q^{n(n+1)}}{(q)_n} = \prod_{n=0}^{\infty} (1 - q^{5n+2})^{-1}(1 - q^{5n+3})^{-1},$$

that relate the q-exponential to modular forms. See Andrews' book [10, Ch. 7] for context.

Here is finally a cute application of these ideas to the asymptotic enumeration of some special polyominoes. Odlyzko and Wilf define in [377, 380] an $(n,m)$ coin fountain as an arrangement of $n$ coins in rows in such a way that there are $m$ coins in the bottom row, and that each coin in a higher row touches exactly two coins in the next lower row. Let $C_{n,m}$ be the number of $(n,m)$ fountains and $C(z,q)$ be the corresponding BGF with $q$ marking $n$ and $z$ marking $m$. Set $C(q) = C(1,q)$. The question is to determine the total number of coin fountains of area $n$, $[q^n]C(q)$. The series starts as (this is EIS A005169)

$$C(q) = 1 + q + q^2 + 2q^3 + 3q^4 + 5q^5 + 9q^6 + 15q^7 + 26q^8 + \cdots,$$

as results from inspection of the first few cases.


There is a clear bijection with Dyck paths that takes area into account: a coin fountain of size $n$ with $m$ coins on its base is equivalent to a Dyck path of length $2m$ and area $2n - m$ (with our earlier definition of area of Dyck paths). From this bijection, one has $C(z,q) = F(z,q)$ (with $F$ as defined earlier) and, in particular, $C(q) = F(1,q)$. Consequently,

(66) $$C(q) = \cfrac{1}{1 - \cfrac{q}{1 - \cfrac{q^2}{1 - \cfrac{q^3}{\ddots}}}},$$

which is (62) with $z = 1$. The identity (65) implies next:

$$C(q) = \frac{E(q,q)}{E(1,q)}.$$


Objects                  Weights (α_j, β_j, γ_j)   Counting           Orth. pol.
Simple paths             1, 1, 0                   Catalan #          Chebyshev
Permutations             j+1, j, 2j+1              Factorial #        Laguerre
Alternating perm.        j+1, j, 0                 Secant #           Meixner
Involutions              1, j, 0                   Odd factorial #    Hermite
Set partition            1, j, j+1                 Bell #             Poisson-Charlier
Nonoverlap. set part.    1, 1, j+1                 Bessel #           Lommel

FIGURE V.8. Some special families of combinatorial objects together with corresponding weights, counting sequences, and orthogonal polynomials. (See also Notes 20-22.)

The rest of the discussion is analogous to Section IV.7.3 (p. 269) relative to alcohols. The function $C(q)$ is a priori meromorphic in $|q| < 1$. An exponential lower bound of the form $1.6^n$ holds for $[q^n]C(q)$, since $(1 - q)/(1 - q - q^2)$ is dominated by $C(q)$ for $q > 0$. At the same time, the number $[q^n]C(q)$ is majorized by the number of compositions, which is $2^{n-1}$. Thus, the radius of convergence of $C(q)$ has to lie somewhere between 0.5 and 0.61803.... It is then easy to check by numerical analysis the existence of a simple zero of the denominator, $E(1,q)$, near $\rho \doteq 0.57614$. Routine computations based on Rouché's theorem then make it possible to verify formally that $\rho$ is the only pole in $|q| \le 3/5$ and that this pole is simple (the process is detailed in [377]). Thus, singularity analysis of meromorphic functions applies:


Proposition V.5. The number of coin fountains made of $n$ coins satisfies asymptotically

$$[q^n]C(q) = cA^n + O((5/3)^n), \qquad c \doteq 0.31236, \quad A = \rho^{-1} \doteq 1.73566.$$


This example illustrates the power of modelling by continued fractions as well as the smooth articulation with meromorphic function asymptotics. END OF EXAMPLE V.7.
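The coin fountain numbers are easy to generate by machine in two independent ways, which gives a check on the q-exponential quotient. The sketch below is illustrative (the truncation order `N` and all function names are assumptions): it expands the continued fraction with $z = 1$ and the quotient $E(q,q)/E(1,q)$ as integer power series in $q$, then inspects the ratio of consecutive coefficients, which is expected to approach $A \doteq 1.73566$.

```python
N = 30  # truncation order in q (an arbitrary choice for this sketch)

def inverse(a):
    """Reciprocal of a power series with a[0] == 1, modulo q^(N+1)."""
    out = [1] + [0] * N
    for n in range(1, N + 1):
        out[n] = -sum(a[k] * out[n - k] for k in range(1, n + 1))
    return out

def multiply(a, b):
    """Product of two power series, modulo q^(N+1)."""
    out = [0] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            out[i + j] += a[i] * b[j]
    return out

def fountains_from_fraction():
    """Expand 1/(1 - q/(1 - q^2/(1 - q^3/...))), evaluated bottom-up."""
    value = [1] + [0] * N
    for k in range(N, 0, -1):
        tail = inverse(value)
        value = [1] + [0] * N
        for i in range(N + 1 - k):
            value[i + k] -= tail[i]          # value = 1 - q^k / (previous value)
    return inverse(value)

def q_exponential_at_power(shift):
    """Series of E(q^shift, q) = sum_n (-1)^n q^(n^2 + shift*n) / (q)_n."""
    total = [0] * (N + 1)
    inv_poch = [1] + [0] * N                 # series of 1/(q)_n, starting with n = 0
    n = 0
    while n * n + shift * n <= N:
        if n > 0:
            for i in range(n, N + 1):        # multiply in place by 1/(1 - q^n)
                inv_poch[i] += inv_poch[i - n]
        e = n * n + shift * n
        sign = -1 if n % 2 else 1
        for i in range(N + 1 - e):
            total[i + e] += sign * inv_poch[i]
        n += 1
    return total

def fountains_from_q_exponential():
    """Expand E(q, q) / E(1, q)."""
    return multiply(q_exponential_at_power(1), inverse(q_exponential_at_power(0)))
```

Both functions return the list of coefficients $[q^n]C(q)$ for $n = 0, \ldots, N$, starting 1, 1, 1, 2, 3, 5, 9, 15, 26.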


Lattice path encodings of classical structures. The systematic theory of lattice 
path enumerations and continued fractions was developed initially because of the need 
to count weighted lattice paths, notably in the context of the analysis of dynamic data 
structures in computer science [179]. In this framework, a system of multiplicative 
weights $\alpha_j, \beta_j, \gamma_j$ is associated with the steps $a_j, b_j, c_j$, each weight being an integer that represents a number of "possibilities" for the corresponding step type. A system of weighted lattice paths has counting generating functions given by the usual specialization of the corresponding multivariate expressions we have just developed, namely,

(67) $$a_j \mapsto \alpha_j z, \qquad b_j \mapsto \beta_j z, \qquad c_j \mapsto \gamma_j z,$$


FIGURE V.9. An interconnection network on 2n = 12 points. 


where z marks the length of paths. One can then sometimes solve an enumeration 
problem expressible in this way by reverse-engineering the known collection of con- 
tinued fractions as found in a reference book like Wall’s treatise [489]. Next, for 
general reasons, the polynomials P,Q are always elementary variants of a family of 
orthogonal polynomials that is determined by the weights (see Note 12 and [88, 457]). 
When the multiplicities have enough structural regularity, the weighted lattice paths 
are likely to correspond to classical combinatorial objects and to classical families of 
orthogonal polynomials; see [168, 179, 238, 244] and Figure 8 for an outline. We 
illustrate this by a simple example due to Lagarias, Odlyzko, and Zagier [322], which 
is relative to involutions without fixed points. 


EXAMPLE V.8. Interconnection networks and involutions. The problem treated here has been introduced by Lagarias, Odlyzko, and Zagier in [322]: There are $2n$ points on a line, with $n$ point-to-point connections between pairs of points. What is the probable behaviour of the width of such an interconnection network? Imagine the points to be $1, \ldots, 2n$, the connections as circular arcs between points, and let a vertical line sweep from left to right; width is defined as the maximum number of edges encountered by such a line. One may freely imagine a tunnel of fixed capacity (this corresponds to the width) inside which wires can be placed to connect points pairwise. See Figure 9.

Let $\mathcal{I}_{2n}$ be the class of all interconnection networks on $2n$ points, which is precisely the collection of ways of grouping $2n$ elements into $n$ pairs, or, equivalently, the class of all involutions without fixed points, i.e., permutations with cycles of length 2 only. The number $I_{2n}$ equals the "odd factorial",

$$I_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n - 1),$$

whose EGF is $e^{z^2/2}$ (see Chapter II, p. 113). The problem calls for determining the quantity $I^{[h]}_{2n}$ that is the number of networks corresponding to a width $\le h$.

The relation to lattice paths is as follows. First, when sweeping a vertical line across a network, define an active arc at an abscissa as one that straddles that abscissa. Then build the sequence of active arc counts at half-integer positions $\frac{1}{2}, \frac{3}{2}, \ldots, 2n - \frac{1}{2}, 2n + \frac{1}{2}$. This constitutes a sequence of integers where each member is $\pm 1$ the previous one, that is, a lattice path without level steps. In other words, there is an ascent in the lattice path for each element that is smaller in its cycle and a descent otherwise. One may view ascents as associated to situations where a node "opens" a new cycle, while descents correspond to "closing" a cycle.

Involutions are much more numerous than lattice paths, so that the correspondence from involutions to lattice paths has to be many-to-one. However, one can easily enrich lattice paths, so that the enriched objects are in one-to-one correspondence with involutions. Consider again a scanning position at a half-integer where the vertical line crosses $\ell$ (active) arcs. If the next node is of the closing type, there are $\ell$ possibilities to choose from. If the next node is of the opening type, then there is only one possibility, namely, to start a new cycle. A complete encoding of a network is obtained by recording additionally the sequence of the $n$ possible choices corresponding to descents in the lattice path (some canonical order is fixed, for instance, oldest first). If we write these choices as superscripts, this means that the set of all enriched encodings of networks is obtained from the set of standard lattice path encodings by effecting the substitutions

$$b_j \mapsto \sum_{k=1}^{j} b_j^{(k)}.$$

FIGURE V.10. Three simulations of random networks with 2n = 1000 illustrate the tendency of the profile to conform to a parabola with height close to n/2 = 250.

The OGF of all involutions is obtained from the generic continued fraction of Proposition V.3 by the substitution

$$a_j \mapsto z, \qquad b_j \mapsto j\,z,$$
where $z$ records the number of steps in the enriched lattice path, or equivalently, the number of nodes in the network. In other words, we have obtained combinatorially a formal continued fraction representation,

$$\sum_{n=0}^{\infty} (1 \cdot 3 \cdots (2n - 1))\,z^{2n} = \cfrac{1}{1 - \cfrac{1 \cdot z^2}{1 - \cfrac{2 \cdot z^2}{1 - \cfrac{3 \cdot z^2}{\ddots}}}},$$

which was originally discovered by Gauss [489]. Proposition V.3 also gives immediately the OGF of involutions of width at most $h$ as a quotient of polynomials. Define

$$I^{[h]}(z) := \sum_{n \ge 0} I^{[h]}_{2n} z^{2n}.$$

One has

$$I^{[h]}(z) = \cfrac{1}{1 - \cfrac{1 \cdot z^2}{1 - \cfrac{2 \cdot z^2}{\ddots\ \cfrac{}{1 - h \cdot z^2}}}} = \frac{P_{h+1}(z)}{Q_{h+1}(z)},$$

where $P_h$ and $Q_h$ satisfy the recurrence

$$Y_{h+1} = Y_h - h z^2\,Y_{h-1}.$$

The polynomials are readily determined by their generating functions that satisfy a first-order linear differential equation reflecting the recurrence. In this way, the denominator polynomials are identified to be reciprocals of the Hermite polynomials,

$$H_h(z) = (2z)^h\,Q_h\!\left(\frac{1}{z\sqrt{2}}\right),$$

themselves defined classically [2, Ch. 22] as orthogonal with respect to the measure $e^{-x^2}\,dx$ on $(-\infty, \infty)$ and expressible via

$$H_m(x) = \sum_{j=0}^{\lfloor m/2 \rfloor} \frac{(-1)^j\,m!}{j!\,(m - 2j)!}\,(2x)^{m-2j}, \qquad \sum_{m \ge 0} H_m(x)\,\frac{t^m}{m!} = e^{2xt - t^2}.$$

In particular, one finds

$$I^{[0]} = 1, \quad I^{[1]} = \frac{1}{1 - z^2}, \quad I^{[2]} = \frac{1 - 2z^2}{1 - 3z^2}, \quad I^{[3]} = \frac{1 - 5z^2}{1 - 6z^2 + 3z^4}, \quad \text{&c.}$$
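The weighted lattice path encoding translates directly into a small dynamic program. The sketch below is illustrative (function names are hypothetical): `bounded_width_networks` counts paths in which an opening step has weight 1 and a closing step from altitude $j$ has weight $j$, while `brute_force_networks` enumerates all pairings explicitly and measures their width with a sweeping line, so that the two counts can be compared.

```python
def bounded_width_networks(n, h):
    """Number of pairings of 2n points with width <= h, as weighted lattice paths.

    An ascent (opening node) has weight 1; a descent from altitude j
    (closing one of j active arcs) has weight j.
    """
    ways = [0] * (h + 1)
    ways[0] = 1
    for _ in range(2 * n):
        new = [0] * (h + 1)
        for alt, w in enumerate(ways):
            if w == 0:
                continue
            if alt + 1 <= h:
                new[alt + 1] += w
            if alt >= 1:
                new[alt - 1] += alt * w
        ways = new
    return ways[0]

def all_pairings(points):
    """Yield every way of grouping the given points into pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        for tail in all_pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

def brute_force_networks(n, h):
    """Count pairings of 0..2n-1 whose width (max number of straddling arcs) is <= h."""
    count = 0
    for pairing in all_pairings(list(range(2 * n))):
        width = max(
            (sum(1 for a, b in pairing if a <= x < b) for x in range(2 * n - 1)),
            default=0,
        )
        if width <= h:
            count += 1
    return count
```

For width at most 2 the counts for $n = 0, \ldots, 4$ are 1, 1, 3, 9, 27, in agreement with the expansion of $(1 - 2z^2)/(1 - 3z^2)$.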


The interesting analysis of the dominant poles of the rational GF's, for any fixed $h$, is discussed in the paper [322]. Furthermore, simulations strongly suggest that the width of a random interconnection network on $2n$ nodes is tightly concentrated around $n/2$; see Figure 10. Louchard [341] succeeded in proving this fact and a good deal more: With high probability, the altitude (the altitude is defined here as the number of active arcs as time evolves) of a random network conforms asymptotically to a deterministic parabola $2nx(1 - x)$ (with $x \in [0,1]$) to which are superimposed random fluctuations of a smaller amplitude, $O(\sqrt{n})$, well-characterized by a Gaussian process. In particular, the width of a random network of $2n$ nodes converges in probability to $n/2$. END OF EXAMPLE V.8.


▷ V.19. Bell numbers and continued fractions. With $S_n = n![z^n]e^{e^z - 1}$ a Bell number:

$$\sum_{n \ge 0} S_n z^n = \cfrac{1}{1 - 1z - \cfrac{1z^2}{1 - 2z - \cfrac{2z^2}{\ddots}}}.$$

[Hint: Define an encoding like for networks, with level steps representing intermediate elements of blocks [168].] Refinements include Stirling partition numbers and involution numbers. ◁


▷ V.20. Factorial numbers and continued fractions. One has

$$\sum_{n \ge 0} n!\,z^n = \cfrac{1}{1 - 1z - \cfrac{1^2 z^2}{1 - 3z - \cfrac{2^2 z^2}{\ddots}}}.$$

Refinements include tangent and secant numbers, as well as Stirling cycle numbers and Eulerian numbers. (This continued fraction is due to Euler; see [168] for a proof based on a bijection of Françon and Viennot [221] and Biane's paper [51] for alternative combinatorics.) ◁


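▷ V.21. Surjection numbers and continued fractions. Let $R_n = n![z^n](2 - e^z)^{-1}$. Then

$$\sum_{n \ge 0} R_n z^n = \cfrac{1}{1 - 1z - \cfrac{2 \cdot 1^2 z^2}{1 - 4z - \cfrac{2 \cdot 2^2 z^2}{1 - 7z - \ddots}}}.$$

This continued fraction is due to Flajolet [170]. ◁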
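Continued fractions of this type can be expanded mechanically, since their Taylor coefficients count weighted lattice paths with level steps. The sketch below is illustrative (the function name and its calling convention are assumptions): `level(j)` stands for the weight $\gamma_j$ of a level step at altitude $j$, and `up_down(j)` for the product $\alpha_j\beta_{j+1}$ attached to an ascent from altitude $j$ together with its matching descent. The weight choices used in the checks are those displayed in the notes above and in Figure V.8.

```python
def continued_fraction_coefficients(up_down, level, count):
    """First `count` Taylor coefficients of

        1 / (1 - level(0) z - up_down(0) z^2 / (1 - level(1) z - up_down(1) z^2 / ...)),

    computed by counting weighted paths with steps +1, 0, -1 that start and
    end at altitude 0 and never go below it.
    """
    coefficients = []
    ways = [1] + [0] * count              # ways[j]: total weight of paths ending at altitude j
    for _ in range(count):
        coefficients.append(ways[0])
        new = [0] * (count + 1)
        for j, w in enumerate(ways):
            if w == 0:
                continue
            new[j] += level(j) * w                    # level step at altitude j
            if j + 1 <= count:
                new[j + 1] += w                       # ascent (its weight is charged to the descent)
            if j >= 1:
                new[j - 1] += up_down(j - 1) * w      # descent from altitude j to j-1
        ways = new
    return coefficients

bell = continued_fraction_coefficients(lambda j: j + 1, lambda j: j + 1, 7)
factorials = continued_fraction_coefficients(lambda j: (j + 1) ** 2, lambda j: 2 * j + 1, 7)
surjections = continued_fraction_coefficients(lambda j: 2 * (j + 1) ** 2, lambda j: 3 * j + 1, 7)
odd_factorials = continued_fraction_coefficients(lambda j: j + 1, lambda j: 0, 7)
```

The four lists reproduce the opening values of the Bell, factorial, surjection, and (interleaved with zeros) odd factorial numbers.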
▷ V.22. The Ehrenfest two-chambers model. See Note II.11, p. 109 for context. The OGF of the number of evolutions that lead to chamber A full satisfies

$$\sum_{n \ge 0} E^{[N]}_n z^n = \cfrac{1}{1 - \cfrac{1 \cdot N z^2}{1 - \cfrac{2(N-1) z^2}{1 - \cfrac{3(N-2) z^2}{\ddots}}}} = \frac{1}{2^N} \sum_{k=0}^{N} \binom{N}{k} \frac{1}{1 - (N - 2k)z}.$$

This results from the EGF of Note II.11, the Continued Fraction Theorem, and basic properties of the Laplace transform. (This continued fraction expansion is originally due to Stieltjes and Rogers. See [245] for additional formulae.) ◁


V.4. The supercritical sequence and its applications 


This schema is combinatorially the simplest of all the ones treated in this chapter, 
since it plainly deals with the sequence construction. An auxiliary analytic condition, 
named “supercriticality” ensures that meromorphic asymptotics applies and entails 
strong statistical regularities. This paradigm of supercritical sequences unifies the as- 
ymptotic properties of a number of seemingly different combinatorial types, including 
integer compositions, surjections, and alignments. 


V.4.1. Combinatorial aspects. We consider a sequence construction, $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$, which may be taken in either the unlabelled or the labelled universe. In either case, we have for the corresponding generating functions the relation

$$F(z) = \frac{1}{1 - G(z)},$$

with as usual $G(0) = 0$. It will prove convenient to set

$$f_n = [z^n]F(z), \qquad g_n = [z^n]G(z),$$

so that the number of $\mathcal{F}_n$ structures is $f_n$ in the unlabelled case and $n!\,f_n$ otherwise.
From Chapter III, the BGF of $\mathcal{F}$-structures with $u$ marking the number of $\mathcal{G}$-components is

(68) $$F(z,u) = \frac{1}{1 - uG(z)}.$$

We also have access to the BGF of $\mathcal{F}$ with $u$ marking the number of $\mathcal{G}_k$-components:

(69) $$F^{\langle k \rangle}(z,u) = \frac{1}{1 - \left(G(z) + (u - 1)g_k z^k\right)}.$$


V.4.2. Analytic aspects. We restrict attention to the case where the radius of 
convergence p of G(z) is nonzero, in which case, the radius of convergence of F(z) 
is also nonzero by virtue of closure properties of analytic functions. Here is the basic 
notion of this section. 


Definition V.4. Let $F$, $G$ be generating functions with nonnegative coefficients that are analytic at 0, with $G(0) = 0$. The analytic relation $F(z) = (1 - G(z))^{-1}$ is said to be supercritical if $G(\rho) > 1$, where $\rho = \rho_G$ is the radius of convergence of $G$. A combinatorial schema $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ is said to be supercritical if the relation $F(z) = (1 - G(z))^{-1}$ between the corresponding generating functions is supercritical.

Note that $G(\rho)$ is well defined in $\mathbb{R} \cup \{+\infty\}$ as the limit $\lim_{x \to \rho^-} G(x)$ since $G(x)$ increases along the positive real axis, for $x \in (0, \rho)$. (The value $G(\rho)$ corresponds to what has been denoted earlier by $\tau_G$ when discussing "signatures" in Section IV.4, p. 236.) From now on we assume that $G(z)$ is aperiodic in the sense that there does not exist an integer $d \ge 2$ such that $G(z) = h(z^d)$ for some $h$ analytic at 0. Put otherwise, the span of $G(z)$ as defined on p. 252 is equal to 1. (This condition entails no loss of analytic generality.)


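Theorem V.5 (Asymptotics of supercritical sequence). Let the schema $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ be supercritical and assume that $G(z)$ is aperiodic. Then, one has

$$[z^n]F(z) = \frac{1}{\sigma G'(\sigma)}\,\sigma^{-n}\left(1 + O(A^n)\right),$$

where $\sigma$ is the root in $(0, \rho_G)$ of $G(\sigma) = 1$ and $A$ is a number less than 1. The number $X$ of $\mathcal{G}$-components in a random $\mathcal{F}$-structure of size $n$ has mean and variance satisfying

$$\mathbb{E}_n(X) = \frac{1}{\sigma G'(\sigma)}\,(n + 1) - 1 + \frac{G''(\sigma)}{G'(\sigma)^2} + O(A^n),$$
$$\mathbb{V}_n(X) = \frac{\sigma G''(\sigma) + G'(\sigma) - \sigma G'(\sigma)^2}{\sigma^2 G'(\sigma)^3}\,n + O(1).$$

In particular, the distribution of $X$ on $\mathcal{F}_n$ is concentrated.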


PROOF. See also [212, 443]. The basic observation is that $G$ increases continuously from $G(0) = 0$ to $G(\rho_G) = \tau_G$ (with $\tau_G > 1$ by assumption) when $x$ increases from 0 to $\rho_G$. Therefore, the positive number $\sigma$, which satisfies $G(\sigma) = 1$, is well defined. Then, $F$ is analytic at all points of the interval $(0, \sigma)$. The function $G$ being analytic at $\sigma$, satisfies, in a neighbourhood of $\sigma$

$$G(z) = 1 + G'(\sigma)(z - \sigma) + \frac{1}{2!}G''(\sigma)(z - \sigma)^2 + \cdots,$$

so that $F(z)$ has a pole at $z = \sigma$; also, this pole is simple since $G'(\sigma) > 0$, by positivity of the coefficients of $G$. Pringsheim's theorem then implies that the radius of convergence of $F$ must coincide with $\sigma$.

There remains to show that $F(z)$ is meromorphic in a disc of some radius $R > \sigma$ with the point $\sigma$ as the only singularity inside the disc. This results from the assumption that $G$ is aperiodic. In effect, by the Daffodil Lemma (Lemma IV.3, p. 253), one has $|G(\sigma e^{i\theta})| < 1$ for all $\theta \not\equiv 0 \pmod{2\pi}$. Thus, by compactness, there exists a disc of radius $R > \sigma$ in which $F$ is analytic except for a unique pole at $\sigma$. Take $r$ such that $\sigma < r < R$ and apply the main theorem of meromorphic function asymptotics to deduce the stated formula with $A = \sigma/r$.


Consider next the number of $\mathcal{G}$-components in a random $\mathcal{F}$ structure of size $n$. Bivariate generating functions give access to the expectation of this random variable:

$$\mathbb{E}_n(X) = \frac{1}{f_n}\,[z^n] \left.\frac{\partial}{\partial u} \frac{1}{1 - uG(z)}\right|_{u=1} = \frac{1}{f_n}\,[z^n] \frac{G(z)}{(1 - G(z))^2}.$$

The problem is now reduced to extracting coefficients in a univariate generating function with a double pole at $z = \sigma$, and it suffices to expand the GF locally at $\sigma$. The variance calculation is similar though it involves a triple pole.

When a sequence construction is supercritical, the number of components is in the mean of order $n$ while its standard deviation is $O(\sqrt{n})$. Thus, the distribution is concentrated (see Section III.2.2, p. 150). In fact, there results from a general theorem of Bender [28] that the distribution of the number of components is asymptotically Gaussian; see Chapter IX for details.
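The estimates of Theorem V.5 can be observed on surjections, for which $G(z) = e^z - 1$, $\sigma = \log 2$, $G'(\sigma) = G''(\sigma) = 2$. The sketch below is illustrative (function names are hypothetical, and the exact counts are obtained from Stirling partition numbers, which is a choice made here rather than a method of the text). It compares the exact number of surjections with $\frac{1}{2}\,n!\,(\log 2)^{-n-1}$ and the exact mean size of the range with $(n+1)/(2\log 2) - 1/2$, the latter being the mean formula of the theorem specialized to this case.

```python
from math import factorial, log

def stirling2_row(n):
    """Row n of Stirling partition numbers: S(n, 0), ..., S(n, n)."""
    row = [1]                                   # S(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            # S(m, k) = S(m-1, k-1) + k * S(m-1, k)
            new[k] = row[k - 1] + (k * row[k] if k < len(row) else 0)
        row = new
    return row

def surjections_by_range(n):
    """Entry k: number of surjections from an n-set onto a k-set, k! * S(n, k)."""
    return [factorial(k) * s for k, s in enumerate(stirling2_row(n))]

def surjection_count(n):
    return sum(surjections_by_range(n))

def mean_range_size(n):
    counts = surjections_by_range(n)
    return sum(k * c for k, c in enumerate(counts)) / sum(counts)

def surjection_estimate(n):
    """Main term of Theorem V.5 for surjections, times n!."""
    sigma = log(2)
    return factorial(n) * 0.5 * sigma ** (-n - 1)

def mean_range_estimate(n):
    """(n+1)/(sigma G'(sigma)) - 1 + G''(sigma)/G'(sigma)^2 with sigma = log 2."""
    return (n + 1) / (2 * log(2)) - 1 + 2 / 4
```

Already for $n = 15$ the relative error of `surjection_estimate` is tiny, reflecting the exponentially small error term $O(A^n)$.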


Profiles of supercritical sequences. We have seen in Chapter III that integer compositions and integer partitions, when sampled at random, tend to assume rather different aspects. Given a sequence construction, $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$, the profile of an element $\alpha \in \mathcal{F}$ is the vector $(X^{\langle 1 \rangle}, X^{\langle 2 \rangle}, \ldots)$ where $X^{\langle j \rangle}(\alpha)$ is the number of $\mathcal{G}$-components in $\alpha$ that have size $j$. In the case of (unrestricted) integer compositions, it could be proved elementarily that, on average and for size $n$, the number of 1 summands is $\sim n/4$, the number of 2 summands is $\sim n/8$, and so on. Now that meromorphic asymptotics is available, such a property can be placed in a much wider perspective.


Theorem V.6 (Profiles of supercritical sequences). Consider a supercritical sequence construction, $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$, with the aperiodicity condition. The number of $\mathcal{G}$-components of any fixed size $k$ in a random $\mathcal{F}$-object of size $n$ satisfies

(70) $$\mathbb{E}_n(X^{\langle k \rangle}) = \frac{g_k \sigma^k}{\sigma G'(\sigma)}\,n + O(1), \qquad \mathbb{V}_n(X^{\langle k \rangle}) = O(n),$$

where $\sigma$ in $(0, \rho_G)$ is such that $G(\sigma) = 1$, and $g_k = [z^k]G(z)$.
PROOF. The bivariate GF with $u$ marking the number of $\mathcal{G}$-components of size $k$ is

$$F(z,u) = \frac{1}{1 - \left(G(z) + (u - 1)g_k z^k\right)},$$

as results from the theory developed in Chapter III. The mean value is then given by a quotient,

$$\mathbb{E}_n(X^{\langle k \rangle}) = \frac{1}{f_n}\,[z^n] \left.\frac{\partial}{\partial u} F(z,u)\right|_{u=1} = \frac{1}{f_n}\,[z^n] \frac{g_k z^k}{(1 - G(z))^2}.$$

The GF of cumulated values has a double pole at $z = \sigma$, and the estimate of the mean value follows. The variance is estimated similarly, after two successive differentiations and the analysis of a triple polar singularity.
The total number of components $X$ satisfies $X = \sum_k X^{\langle k \rangle}$, and, by Theorem V.5, its mean is asymptotic to $n/(\sigma G'(\sigma))$. Thus, Equation (70) indicates that, at least in some average-value sense, the "proportion" of components of size $k$ amongst all components is given by $g_k \sigma^k$.
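For integer compositions, where $g_k = 1$, $\sigma = 1/2$ and $G'(\sigma) = 4$, the constant in (70) equals $2^{-(k+1)}$. The sketch below is illustrative (function names are hypothetical): it computes the exact mean number of summands equal to $k$ by extracting coefficients from $g_k z^k/(1 - G(z))^2$ and $1/(1 - G(z))$, using exact rational arithmetic, so that the linear growth $n/2^{k+1} + O(1)$ can be observed.

```python
from fractions import Fraction

def sequence_series(g, order):
    """Coefficients f_0..f_order of F(z) = 1/(1 - G(z)), given g[0] == 0."""
    f = [1] + [0] * order
    for n in range(1, order + 1):
        f[n] = sum(g[k] * f[n - k] for k in range(1, n + 1))
    return f

def mean_components_of_size(g, k, n):
    """Exact mean number of components of size k among F-structures of size n.

    This is [z^n] g_k z^k / (1 - G(z))^2 divided by [z^n] 1/(1 - G(z)).
    """
    f = sequence_series(g, n)
    if n < k:
        return Fraction(0)
    squared = sum(f[i] * f[n - k - i] for i in range(n - k + 1))   # [z^(n-k)] F(z)^2
    return Fraction(g[k] * squared, f[n])

# Compositions: G(z) = z/(1 - z), that is g_k = 1 for every k >= 1.
n = 60
g = [0] + [1] * n
means = {k: mean_components_of_size(g, k, n) for k in range(1, 5)}
```

With these values, `means[1]` is close to $n/4$ and `means[2]` to $n/8$, the difference from $n/2^{k+1}$ remaining bounded as $n$ grows.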
▷ V.23. Proportion of k-components and convergence in probability. For any fixed $k$, the random variable $X_n^{\langle k \rangle}/X_n$ converges in probability to the value $g_k \sigma^k$,

$$\frac{X_n^{\langle k \rangle}}{X_n} \xrightarrow{\ P\ } g_k \sigma^k, \quad \text{i.e.,} \quad \lim_{n \to \infty} \mathbb{P}\left\{ g_k \sigma^k (1 - \epsilon) \le \frac{X_n^{\langle k \rangle}}{X_n} \le g_k \sigma^k (1 + \epsilon) \right\} = 1,$$

for any $\epsilon > 0$. The proof is an easy consequence of the Chebyshev inequalities (the distributions of $X_n$ and $X_n^{\langle k \rangle}$ are both concentrated). ◁


V.4.3. Applications. We examine here two types of applications of the supercritical
sequence schema.


— Example 9 makes explicit the asymptotic enumeration and the analysis of
profiles of compositions, surjections and alignments. What stands out is the
way the mean profile of a structure reflects the underlying inner construction
$\mathfrak{K}$ in schemas of the form $\mathrm{SEQ}(\mathfrak{K}(\mathcal{Z}))$.

— Example 10 discusses compositions into restricted summands, including the
striking case of compositions into primes.


EXAMPLE V.9. Compositions, surjections, and alignments. The three classes of interest here
are integer compositions ($\mathcal{C}$), surjections ($\mathcal{R}$) and alignments ($\mathcal{O}$), which are specified as

$\mathcal{C} = \mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})), \qquad \mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z})), \qquad \mathcal{O} = \mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z}))$

and belong to either the unlabelled universe ($\mathcal{C}$) or to the labelled universe ($\mathcal{R}$ and $\mathcal{O}$). The
generating functions (of type OGF, EGF, and EGF respectively) are

$C(z) = \frac{1}{1 - \frac{z}{1-z}}, \qquad R(z) = \frac{1}{1 - (e^z - 1)}, \qquad O(z) = \frac{1}{1 - \log\frac{1}{1-z}}.$

A direct application of Theorem V.5 gives us the already known results

$C_n = 2^{n-1}, \qquad \frac{1}{n!} R_n \sim \frac{1}{2} (\log 2)^{-n-1}, \qquad \frac{1}{n!} O_n \sim e^{-1} (1 - e^{-1})^{-n-1},$

corresponding to $\sigma$ equal to $\frac{1}{2}$, $\log 2$, and $1 - e^{-1}$, respectively.

Similarly, the expected number of summands in a random composition of the integer $n$
is $\sim \frac{n+1}{2}$. The expected cardinality of the range of a random surjection whose domain has
cardinality $n$ is asymptotic to $\beta n$ with $\beta = 1/(2 \log 2)$. The expected number of components
in a random alignment of size $n$ is asymptotic to $n/(e - 1)$.

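Theorem V.6 also applies and gives the mean number of components of size $k$ in each case.
The following table summarizes the conclusions:

Structures      Specif.                                    Law ($g_k \sigma^k$)       Type
Compositions    $\mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$      $1/2^k$                    Geometric
Surjections     $\mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$      $(\log 2)^k / k!$          Poisson
Alignments      $\mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z}))$              $(1 - e^{-1})^k / k$       Logarithmic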


Note that the stated laws necessitate $k \ge 1$. The geometric and Poisson laws are classical;
the logarithmic distribution (also called "logarithmic-series distribution") of parameter $\lambda$ is by
definition the law of a discrete random variable $Y$ such that

$\mathbb{P}(Y = k) = \frac{1}{\log(1 - \lambda)^{-1}} \, \frac{\lambda^k}{k}, \qquad k \ge 1.$
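
As a quick sanity check, each of the three laws $g_k \sigma^k$ should sum to 1 over $k \ge 1$, since $G(\sigma) = 1$. The sketch below is illustrative only; the helper `law_total` and the truncation at 100 terms are assumptions made here (the neglected tails are far below double precision).

```python
import math

def law_total(term, kmax=100):
    """Sum term(k) for k = 1..kmax; the neglected tails are negligible here."""
    return sum(term(k) for k in range(1, kmax + 1))

sigma_surj = math.log(2)           # root of e^z - 1 = 1 (surjections)
sigma_align = 1 - math.exp(-1)     # root of log(1/(1-z)) = 1 (alignments)

geometric = law_total(lambda k: 0.5 ** k)                          # compositions
poisson = law_total(lambda k: sigma_surj ** k / math.factorial(k)) # surjections
logarithmic = law_total(lambda k: sigma_align ** k / k)            # alignments
print(geometric, poisson, logarithmic)
```

Note that for alignments the parameter is $\lambda = 1 - e^{-1}$, for which the normalizing factor $1/\log(1-\lambda)^{-1}$ equals 1.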


The way the internal construction $\mathfrak{K}$ in the schema $\mathrm{SEQ}(\mathfrak{K}(\mathcal{Z}))$ determines the law of
component sizes,


Sequence ↦ Geometric; Set ↦ Poisson; Cycle ↦ Logarithmic,
FIGURE V.11. Profile of structures drawn at random represented by the sizes of their
components in sorted order: (from left to right) a random surjection, alignment, and
composition of size $n = 100$.


stands out. Figure 11 exemplifies the phenomenon by displaying components sorted by size and
represented by vertical segments of corresponding lengths for three randomly drawn objects of
size $n = 100$. .......... END OF EXAMPLE V.9.


EXAMPLE V.10. Compositions with restricted summands, compositions into primes.
Unrestricted integer compositions are well understood as regards enumeration: their number is
exactly $C_n = 2^{n-1}$, their OGF is $C(z) = (1 - z)/(1 - 2z)$, and compositions with $k$
summands are enumerated by binomial coefficients. Such simple exact formulae disappear when
restricted compositions are considered, but, as we now show, asymptotics is much more robust
to changes in specifications.

Let $S$ be a subset of the integers $\mathbb{Z}_{\ge 1}$ such that $\gcd(S) = 1$, i.e., not all members of $S$
are multiples of a common divisor $d \ge 2$. In order to avoid trivialities, we also assume that
$S \ne \{1\}$. The class $\mathcal{C}^S$ of compositions with summands constrained to the set $S$ then satisfies:

Specification: $\mathcal{C}^S = \mathrm{SEQ}(\mathrm{SEQ}_S(\mathcal{Z}))$;
OGF: $C^S(z) = \frac{1}{1 - S(z)}, \qquad S(z) = \sum_{s \in S} z^s.$

By assumption, $S(z)$ is aperiodic, so that Theorem V.5 applies directly. There is a well-defined
number $\sigma$ such that

$S(\sigma) = 1, \qquad 0 < \sigma < 1,$

and the number of $S$-restricted compositions satisfies

(71) $C_n^S := [z^n] C^S(z) = \frac{1}{\sigma S'(\sigma)} \cdot \sigma^{-n} \left( 1 + O(A^n) \right).$
Amongst the already discussed cases, $S = \{1, 2\}$ gives rise to Fibonacci numbers and, more
generally, $S = \{1, \ldots, r\}$ corresponds to compositions with summands at most $r$. In this case, the
OGF,

$C^{\{1,\ldots,r\}}(z) = \frac{1}{1 - z \frac{1 - z^r}{1 - z}} = \frac{1 - z}{1 - 2z + z^{r+1}},$

is a simple variant of the OGF associated to longest runs in strings. The treatment of the latter
can be copied almost verbatim to the effect that the largest component in a random composition
of $n$ is found to be $\lg n + O(1)$, both on average and with high probability.
  n     exact value             asymptotic formula (rounded)
 10     16                      15
 20     732                     734
 30     36039                   36057
 40     1772207                 1772261
 50     87109263                87109248
 60     4281550047              4281549331
 70     210444532770            210444530095
 80     10343662267187          10343662265182
 90     508406414757253         508406414781706
100     24988932929490838       24988932929612479

FIGURE V.12. The pyramid relative to compositions into prime summands for $n =
10 .. 100$: (left: exact values; right: asymptotic formula rounded).


Here is a surprising application of the general theory. Consider the case where S is taken 
to be the set of prime numbers, Prime = {2,3,5,7,11,...}, thereby defining the class of 
compositions into prime summands. The sequence starts as 


$1, 0, 1, 1, 1, 3, 2, 6, 6, 10, 16, 20, 35, 46, 72, 105,$


corresponding to $G(z) = z^2 + z^3 + z^5 + \cdots$, and is EIS A023360 in Sloane's encyclopedia.
The formula (71) provides the asymptotic form of the number of such compositions. It is also 
worth noting that the constants appearing in (71) are easily determined to great accuracy, as we 
now explain. 

By (71) and the preceding equation, the dominant singularity of the OGF of compositions
into primes is the positive root $\sigma < 1$ of the characteristic equation

$S(z) \equiv \sum_{p\ \mathrm{prime}} z^p = 1.$

Fix a threshold value $m_0$ (for instance $m_0 = 10$ or 100) and introduce the two series

$S^-(z) := \sum_{s \in S,\ s < m_0} z^s, \qquad S^+(z) := \Big( \sum_{s \in S,\ s < m_0} z^s \Big) + \frac{z^{m_0}}{1 - z}.$

Clearly, for $x \in (0,1)$, one has $S^-(x) < S(x) < S^+(x)$. Define then two constants $\sigma^-, \sigma^+$
by the conditions

$S^-(\sigma^-) = 1, \qquad S^+(\sigma^+) = 1, \qquad 0 < \sigma^-, \sigma^+ < 1.$

These constants are algebraic numbers that are accessible to computation. At the same time,
they satisfy $\sigma^+ < \sigma < \sigma^-$. As the order of truncation, $m_0$, increases, the values of $\sigma^+, \sigma^-$
provide better and better approximations to $\sigma$, together with an interval in which $\sigma$ provably
lies. For instance, $m_0 = 10$ is enough to determine that $0.66 < \sigma < 0.69$, and the choice
$m_0 = 100$ gives $\sigma$ to 15 guaranteed digits of accuracy, namely, $\sigma \doteq 0.67740\,17761\,30660$.
Then, the asymptotic formula (71) instantiates as

(72) $C_n^{\mathrm{Prime}} \sim g(n), \qquad g(n) := 0.30365\,52633 \cdot 1.47622\,87836^n.$
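
The bracketing of $\sigma$ by $\sigma^+$ and $\sigma^-$ can be reproduced numerically. What follows is a rough sketch in double precision rather than a certified computation: the function names, the bisection routine, and the search interval $(0, 0.99)$ are choices made here for illustration, not taken from the text.

```python
def primes_below(limit):
    """Primes p < limit by trial division (adequate for small limits)."""
    return [p for p in range(2, limit)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def root_in_unit_interval(func, lo=0.0, hi=0.99, steps=200):
    """Bisection for the root of an increasing func with func(lo) < 0 < func(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sigma_bounds(m0):
    """Return (sigma_plus, sigma_minus), the roots of S^+(x) = 1 and S^-(x) = 1."""
    ps = primes_below(m0)
    s_minus = lambda x: sum(x ** p for p in ps) - 1
    s_plus = lambda x: sum(x ** p for p in ps) + x ** m0 / (1 - x) - 1
    return root_in_unit_interval(s_plus), root_in_unit_interval(s_minus)

lo10, hi10 = sigma_bounds(10)        # coarse enclosure of sigma
lo100, hi100 = sigma_bounds(100)     # enclosure tighter than double precision
sigma = (lo100 + hi100) / 2
# Multiplier 1/(sigma * S'(sigma)) of formula (71), with S' truncated at m0 = 100.
multiplier = 1 / sum(p * sigma ** p for p in primes_below(100))
print(lo10, hi10, sigma, 1 / sigma, multiplier)
```

Since $\sigma S'(\sigma) = \sum_p p\, \sigma^p$, the last line evaluates the constant in front of $\sigma^{-n}$ directly from the truncated series.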


The constant $\sigma^{-1} \doteq 1.47622$ is akin to the family of Backhouse constants described in [165].
Once more, the asymptotic approximation is very good as exemplified by the pyramid of
Figure 12. The difference between $C_n^{\mathrm{Prime}}$ and its approximation $g(n)$ from Eq. (72) is plotted
on the left of Figure 13. The seemingly haphazard oscillations that manifest themselves are well
explained by the principles discussed in Section IV.6.1 (p. 250). It appears that the next poles
FIGURE V.13. Errors in the approximation of the number of compositions into primes
for $n = 70 .. 100$: left, the values of $C_n^{\mathrm{Prime}} - g(n)$; right, the correction $g_2(n)$ arising
from the next two poles, which are complex conjugate, and the continuous extrapolation of
this approximation.


of the OGF are complex conjugate and lie near $-0.76 \pm 0.44i$, having modulus about 0.88.
The corresponding residues then jointly contribute a quantity of the form

$g_2(n) = c \cdot A^n \sin(\omega n + \omega_0), \qquad A \doteq 1.13290,$

for some constants $c, \omega, \omega_0$. Comparing the left and right parts of Figure 13, we see that this
next layer of poles explains quite well the residual error $C_n^{\mathrm{Prime}} - g(n)$.
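
The exact counts are easy to obtain from the recurrence $c_n = \sum_{p \le n} c_{n-p}$ implied by $C(z) = 1/(1 - S(z))$. The sketch below (function names are illustrative, and the constants of (72) are copied to the ten digits printed in the text) compares them with $g(n)$. With ten-digit constants only the relative error is meaningful in double precision; the absolute residuals plotted in Figure 13 would require the constants to higher accuracy.

```python
def primes_upto(limit):
    """Primes p <= limit by trial division."""
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def compositions_into(parts, nmax):
    """c[n] = number of compositions of n with all summands in `parts`."""
    c = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        c[n] = sum(c[n - s] for s in parts if s <= n)
    return c

def g(n):
    """Main asymptotic term of (72), constants as printed in the text."""
    return 0.3036552633 * 1.4762287836 ** n

counts = compositions_into(primes_upto(100), 100)
relative_error = {n: counts[n] / g(n) - 1 for n in range(10, 101, 10)}
for n, err in relative_error.items():
    print(n, counts[n], round(g(n)), err)
```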

Here is a final example that demonstrates in a striking way the scope of the method. Define
the set $\mathrm{Prime}_2$ of "twinned primes" as the set of primes that belong to a twin prime pair, that is,
$p \in \mathrm{Prime}_2$ if one of $p - 2$, $p + 2$ is prime. The set $\mathrm{Prime}_2$ starts as $3, 5, 7, 11, 13, 17, 19, 29, 31, \ldots$
(numbers like 23 or 37 are thus excluded). The asymptotic formula for the number of
compositions of the integer $n$ into summands that are twinned primes is

$C_n^{\mathrm{Prime}_2} \sim 0.18937 \cdot 1.29799^n,$

where the constants are found by methods analogous to the case of all primes. It is quite
remarkable that the constants involved are still computable real numbers (and of low complexity,
even), this despite the fact that it is not known whether the set of twinned primes is finite or
infinite. Incidentally, a sequence that starts like $C_n^{\mathrm{Prime}_2}$,

$1, 0, 0, 1, 0, 1, 1, 1, 2, 1, 3, 4, 3, 7, 7, 8, 14, 15, 21, 28, 33, 47, 58, \ldots$

and coincides till index 22 included (!), but not beyond, was encountered by P. A. MacMahon$^{10}$,
as the authors discovered, much to their astonishment, from scanning Sloane's Encyclopedia,
where it appears as EIS A002124. .......... END OF EXAMPLE V.10.


$^{10}$ See "Properties of prime numbers deduced from the calculus of symmetric functions", Proc. London
Math. Soc., 23 (1923), 290-316. MacMahon's sequence corresponds to compositions into arbitrary odd
primes, and 23 is the first such prime that is not twinned.
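
The coincidence up to index 22 can be confirmed mechanically. This is a small illustrative sketch (all names are chosen here): it counts compositions into twinned primes and into odd primes by the same recurrence and locates the first index at which the two sequences differ.

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def compositions_into(parts, nmax):
    """c[n] = number of compositions of n with all summands in `parts`."""
    c = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        c[n] = sum(c[n - s] for s in parts if s <= n)
    return c

NMAX = 30
twinned = [p for p in range(2, NMAX + 1)
           if is_prime(p) and (is_prime(p - 2) or is_prime(p + 2))]
odd_primes = [p for p in range(3, NMAX + 1) if is_prime(p)]

c_twinned = compositions_into(twinned, NMAX)
c_odd = compositions_into(odd_primes, NMAX)
# First index where the twinned-prime and odd-prime counts disagree.
first_gap = next(n for n in range(NMAX + 1) if c_twinned[n] != c_odd[n])
print(twinned, first_gap)
```

The two counts differ by exactly one at the first gap, the missing composition being the single summand 23.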
▷ V.24. Random generation of supercritical sequences. Let $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ be a supercritical
sequence scheme. Consider a sequence of i.i.d. (independent, identically distributed) random
variables $Y_1, Y_2, \ldots$, each of them obeying the discrete law

$\mathbb{P}(Y = k) = g_k \sigma^k, \qquad k \ge 1.$

A sequence is said to be hitting $n$ if $Y_1 + \cdots + Y_r = n$ for some $r \ge 1$. The vector $(Y_1, \ldots, Y_r)$
for a sequence conditioned to hit $n$ has the same distribution as the sequence of the lengths of
components in a random $\mathcal{F}$-object of size $n$.

For probabilists, this explains the shape of the formulae in Theorem V.5, which resemble
renewal relations [161, Sec. XIII.10]. It also implies that, given a uniform random generator for
$\mathcal{G}$-objects, one can generate a random $\mathcal{F}$-object of size $n$ in $O(n)$ steps on average [139]. This
applies to surjections, alignments, and compositions in particular. ◁
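
One way to turn this note into a sampler is plain rejection: draw i.i.d. sizes until the partial sum reaches or passes $n$, and start again if it overshoots. The sketch below only produces the sequence of component sizes; it assumes the law $g_k \sigma^k$ is supplied as a finite dictionary, and the function name and the fixed seed are illustrative choices rather than anything prescribed by the text or by [139].

```python
import random

def sample_component_sizes(n, law, rng):
    """Sample the component sizes of a random SEQ(G)-object of size n.

    law maps each size k to g_k * sigma^k (the values must sum to 1).
    Walks that overshoot n are rejected and the procedure restarts.
    """
    sizes, weights = zip(*sorted(law.items()))
    while True:
        total, parts = 0, []
        while total < n:
            y = rng.choices(sizes, weights=weights)[0]
            parts.append(y)
            total += y
        if total == n:              # the i.i.d. sequence "hits" n
            return parts

# Compositions into summands 1 and 2: S(z) = z + z^2, sigma = (sqrt(5) - 1)/2.
sigma = (5 ** 0.5 - 1) / 2
law = {1: sigma, 2: sigma ** 2}
rng = random.Random(12345)          # fixed seed, for reproducibility only
parts = sample_component_sizes(50, law, rng)
print(parts)
```

An accepted vector $(y_1, \ldots, y_r)$ has probability proportional to $\prod_i g_{y_i} \sigma^{y_i} = \sigma^n \prod_i g_{y_i}$, which is the uniform distribution over objects of size $n$ once each size $y_i$ is filled by a uniform $\mathcal{G}$-object.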


▷ V.25. Largest components in supercritical sequences. Let $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ be a supercritical
sequence. Assume that $g_k = [z^k]G(z)$ satisfies the asymptotic "smoothness" condition

$g_k \sim c\, \rho^{-k} k^{\beta}, \qquad c, \rho \in \mathbb{R}_{>0}, \quad \beta \in \mathbb{R}.$

Then the size $L$ of the largest $\mathcal{G}$-component in a random $\mathcal{F}$-object satisfies, for size $n$,

$\mathbb{E}_{\mathcal{F}_n}(L) = \frac{1}{\log(\rho/\sigma)} \left( \log n + \beta \log\log n \right) + o(\log\log n).$

This covers integer compositions ($\rho = 1$, $\beta = 0$) and alignments ($\rho = 1$, $\beta = -1$). [The
analysis generalizes the case of longest runs in Example 2 and is based on similar principles. The
GF of $\mathcal{F}$-objects with $L \le m$ is $F^{(m)}(z) = \left( 1 - \sum_{k \le m} g_k z^k \right)^{-1}$, according to Section III.7.
For $m$ large enough, this has a dominant singularity which is a simple pole at $\sigma_m$ such that
$\sigma_m - \sigma \sim c_1 (\sigma/\rho)^m m^{\beta}$. There follows a double-exponential approximation

$\mathbb{P}_{\mathcal{F}_n}(L \le m) \approx \exp\left( -c_2\, n\, m^{\beta} (\sigma/\rho)^m \right)$

in the "central" region. See Gourdon's study [246] for details.] ◁


V.5. Paths in graphs and automata 


In this section, we develop the framework of paths in graphs: given a graph $\Gamma$,
a source node, and a destination node, the problem is to enumerate all paths from 
the source to the destination. Nonnegative weights acting multiplicatively (probabil- 
ities, multiplicities) may be attached to edges. Applications include the analysis of 
walks in various types of graphs as well as languages described by finite automata. 
Under a fundamental structural condition, known as irreducibility and correspond- 
ing to strong-connectedness of the graph, generating functions of paths all have the 
same dominant singularity, which is a simple pole. This essential property implies 
simple exponential forms for the asymptotics of coefficients (possibly tempered by 
explicit congruence conditions in the periodic case). The corresponding results can 
equivalently be formulated in terms of the set of eigenvalues (the spectrum) of the 
corresponding adjacency matrix and are related to the classical Perron-Frobenius the- 
ory of nonnegative matrices—under irreducibility, only the largest positive eigenvalue 
matters asymptotically. 




V.5.1. Combinatorial aspects. A directed graph or digraph $\Gamma$ is determined by
the pair $(V, E)$ of its vertex set $V$ and its edge set $E \subseteq V \times V$. Here, self-loops
corresponding to edges of the form $(v, v)$ are allowed. Given an edge, $e = (a, b)$,
we denote its origin by orig$(e) := a$ and its destination by destin$(e) := b$. For $\Gamma$ a
digraph with vertex set identified to the set $\{1, \ldots, m\}$, we allow each edge $(a, b)$ to
be weighted by a quantity $g_{a,b}$, which we may take as a formal indeterminate, and for
which we allow ourselves to substitute positive weight values. The matrix $G$ such that

(73) $G_{a,b} = g_{a,b}$ if the edge $(a, b) \in \Gamma$, $\qquad G_{a,b} = 0$ otherwise,

is called the weighted (or formal) adjacency matrix of the (weighted) graph $\Gamma$. The usual
adjacency matrix of $\Gamma$ is obtained by the substitution $g_{a,b} \mapsto 1$.

A path is a sequence of edges, $w = (e_1, \ldots, e_n)$, such that, for all $j$ with
$1 \le j < n$, one has destin$(e_j)$ = orig$(e_{j+1})$. The parameter $n$ is called the length of the
path and we define: orig$(w)$ := orig$(e_1)$, destin$(w)$ := destin$(e_n)$. A circuit is a
path whose origin and destination are the same vertex. Note that, with our definition,
a circuit has its origin that is distinguished. We do not identify here two circuits such
that one is obtained by circular permutation from the other and also refer to circuits
with such a root distinguished as rooted circuits.

From the standard definition of matrix products, the powers $G^n$ have elements
that are path polynomials. More precisely, one has the simple but essential relation,

(74) $(G^n)_{i,j} = \sum_{w \in \mathcal{F}_n^{(i,j)}} w,$

where $\mathcal{F}_n^{(i,j)}$ is the set of paths in $\Gamma$ that connect $i$ to $j$ and have length $n$, and a path
$w$ is identified with the monomial in indeterminates $\{g_{i,j}\}$ that represents
multiplicatively the succession of its edges; for instance:

$(G^3)_{i,j} = \sum_{v_1 = i,\, v_2,\, v_3,\, v_4 = j} g_{v_1,v_2}\, g_{v_2,v_3}\, g_{v_3,v_4}.$

In other words: powers of the matrix associated to a graph generate all paths in the
graph, the weight of a path being the product of the weights of the individual edges
it comprises. (This fact probably constitutes the most basic result of algebraic graph
theory [53, p. 9].) One may then treat simultaneously all lengths of paths (and all
powers of matrices) by introducing the variable $z$ to record length.

Proposition V.6. (i) Let $\Gamma$ be a digraph and let $G$ be the formal adjacency matrix
of $\Gamma$ as given by (73). The OGF $F^{(i,j)}(z)$ of the set of all paths from $i$ to $j$ in $\Gamma$, with
$z$ marking length and $g_{a,b}$ the weight associated to edge $(a, b)$, is the entry $i, j$ of the
matrix $(I - zG)^{-1}$, namely

(75) $F^{(i,j)}(z) = \left( (I - zG)^{-1} \right)_{i,j} = (-1)^{i+j}\, \frac{\Delta^{(j,i)}(z)}{\Delta(z)},$

where $\Delta(z) = \det(I - zG)$ is the reciprocal polynomial of the characteristic
polynomial of $G$ and $\Delta^{(j,i)}(z)$ is the determinant of the minor of index $j, i$ of $I - zG$.

(ii) The generating function of (rooted) circuits is expressible in terms of a
logarithmic derivative,

(76) $\sum_i \left( F^{(i,i)}(z) - 1 \right) = -z\, \frac{\Delta'(z)}{\Delta(z)}.$

In this algebraic statement, if one takes the $\{g_{a,b}\}$ as formal indeterminates, then
$F^{(i,j)}(z)$ is a multivariate GF of paths in $z$ with the variable $g_{a,b}$ marking the
number of occurrences of edge $(a, b)$. The result specializes to the case where the $g_{a,b}$ are
assigned numerical values, in which case $[z^n]F^{(i,j)}(z)$ becomes the total weight of
paths of length $n$, which we also refer to as "number of paths" in the weighted graph.
PROOF. For the proof, it is convenient to assume that the quantities $g_{a,b}$ are assigned
arbitrary real numbers, so that usual matrix operations (triangularization,
diagonalization, and so on) can be easily applied. Since the properties expressed by the statement
are ultimately equivalent to a collection of multivariate polynomial identities, their
general validity is implied by the fact that they hold for all real assignments of values.

Part (i) results from the fundamental equivalence between paths and matrix
products (74), which implies

$F^{(i,j)}(z) = \sum_{n=0}^{\infty} z^n (G^n)_{i,j} = \left( (I - zG)^{-1} \right)_{i,j},$

and from the cofactor formula of matrix inversion.

Part (ii) results from elementary properties of the matrix trace$^{11}$ functional. With $m$
the dimension of $G$ and $\{\lambda_1, \ldots, \lambda_m\}$ the multiset of its eigenvalues, we have

(77) $\sum_{i=1}^{m} F_n^{(i,i)} = \mathrm{Tr}\, G^n = \sum_{j=1}^{m} \lambda_j^n,$

where $F_n^{(i,i)} = [z^n] F^{(i,i)}(z)$. Upon taking a generating function, there results that

(78) $\sum_{i=1}^{m} \sum_{n=1}^{\infty} F_n^{(i,i)} z^n = \sum_{j=1}^{m} \frac{\lambda_j z}{1 - \lambda_j z},$

which, up to a factor of $-z$, is none other than the logarithmic derivative of $\Delta(z)$.
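
Relations (74) and (77) are easy to test on a small example. The sketch below uses the two-vertex graph with edges $1 \to 1$, $1 \to 2$, $2 \to 1$ (indexed from 0 in the code), an arbitrary choice made here; the helper names are likewise illustrative. It compares entries of $G^n$ with a brute-force enumeration of paths and lists the traces $\mathrm{Tr}\, G^n$, which count rooted circuits of length $n$.

```python
def mat_mul(a, b):
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(g, n):
    m = len(g)
    result = [[int(i == j) for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(result, g)
    return result

def count_paths(g, i, j, n):
    """Brute force: total weight of the paths of length n from i to j."""
    if n == 0:
        return int(i == j)
    return sum(g[i][k] * count_paths(g, k, j, n - 1)
               for k in range(len(g)) if g[i][k])

G = [[1, 1],
     [1, 0]]
G5 = mat_pow(G, 5)
# Tr G^n = number of rooted circuits of length n, for n = 1..5.
traces = [sum(mat_pow(G, n)[i][i] for i in range(2)) for n in range(1, 6)]
print(G5, traces)
```

For this graph the eigenvalues are $(1 \pm \sqrt{5})/2$, so the traces are the sums $\lambda_1^n + \lambda_2^n$ of (77).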


▷ V.26. Positivity of inverses of characteristic polynomials. Let $G$ have nonnegative
coefficients. Then, the rational function $Z_G(z) := 1/\det(I - zG)$ has nonnegative Taylor
coefficients. More generally, if $G = (g_{a,b})$ is a matrix in the formal indeterminates $g_{a,b}$, then
$[z^n] Z_G(z)$ is a polynomial in the $g_{a,b}$ with nonnegative coefficients. (Hint: The proof proceeds
by integration from (76): we have, for $1/\Delta(z)$, the equivalent expressions

$\frac{1}{\Delta(z)} = \exp\left( -\int_0^z \frac{\Delta'(t)}{\Delta(t)}\, dt \right) = \exp\left( \int_0^z \sum_i \left( F^{(i,i)}(t) - 1 \right) \frac{dt}{t} \right) = \exp\left( \sum_{n \ge 1} \frac{z^n}{n}\, \mathrm{Tr}\, G^n \right),$

which ensure positivity of the coefficients of $Z_G$.) ◁


$^{11}$ If $H$ is an $m \times m$ matrix with multiset of eigenvalues $\{\mu_1, \ldots, \mu_m\}$, the trace is defined by
$\mathrm{Tr}\, H := \sum_{i=1}^{m} (H)_{i,i}$ and, by triangularization (Jordan form), it satisfies $\mathrm{Tr}\, H = \sum_{j=1}^{m} \mu_j$.
▷ V.27. MacMahon's Master Theorem. Let $J$ be the determinant

$J(z_1, \ldots, z_m) := \begin{vmatrix} 1 - z_1 g_{11} & -z_2 g_{12} & \cdots & -z_m g_{1m} \\ -z_1 g_{21} & 1 - z_2 g_{22} & \cdots & -z_m g_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ -z_1 g_{m1} & -z_2 g_{m2} & \cdots & 1 - z_m g_{mm} \end{vmatrix}.$

MacMahon's "Master Theorem" asserts the identity of coefficients,

$[z_1^{\alpha_1} \cdots z_m^{\alpha_m}]\, \frac{1}{J(z_1, \ldots, z_m)} = [z_1^{\alpha_1} \cdots z_m^{\alpha_m}]\, Y_1^{\alpha_1} \cdots Y_m^{\alpha_m}, \qquad \text{where } Y_j = \sum_i g_{j,i}\, z_i.$

This result can be obtained by a simple change of variables in a multivariate Cauchy integral and
is related to multivariate Lagrange inversion [244, pp. 21-23]. Cartier and Foata [80] provide a
general combinatorial interpretation related to trace monoids of Note 6, p. 285. ◁

▷ V.28. The Jacobi trace formula. Jacobi's trace formula [244, p. 11] for square matrices is

(79) $\det \circ \exp(M) = \exp \circ \mathrm{Tr}(M)$

or equivalently, with due care paid to determinations, $\log \circ \det(M) = \mathrm{Tr} \circ \log(M)$, which
generalizes the scalar identities $e^a e^b = e^{a+b}$ and $\log ab = \log a + \log b$. (Hint: recycle the
computations of Note 26.) ◁


▷ V.29. Fast computation of the characteristic polynomial. The following algorithm is due
to Leverrier (1811-1877), the astronomer and mathematician who, together with Adams, first
predicted the position of the planet Neptune. Since, by (77) and (78), one has

$\sum_{n \ge 1} z^n\, \mathrm{Tr}\, G^n = \sum_{j=1}^{m} \frac{\lambda_j z}{1 - \lambda_j z},$

it is possible to deduce an algorithm that determines the characteristic polynomial of a matrix
of dimension $m$ in $O(m^4)$ arithmetic operations. [Hint: computing the quantities $\mathrm{Tr}\, G^j$ for
$j = 1, \ldots, m$ is sufficient and requires precisely $m$ matrix multiplications.] ◁
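
A possible realization of the hint is sketched below. It is one reading of Leverrier's method, under the assumption that the power sums $p_j = \mathrm{Tr}\, G^j$ are converted to the elementary symmetric functions $e_k$ of the eigenvalues through Newton's identities; the function names are illustrative and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def mat_mul(a, b):
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def char_poly(g):
    """Coefficients [1, c_1, ..., c_m] of det(xI - G), from traces of powers."""
    m = len(g)
    power = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    p = [None]                       # p[j] = Tr G^j, for j = 1..m
    for _ in range(m):
        power = mat_mul(power, g)
        p.append(sum(power[i][i] for i in range(m)))
    # Newton's identities: k * e_k = sum_{i=1..k} (-1)^(i-1) * e_{k-i} * p_i.
    e = [Fraction(1)]
    for k in range(1, m + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        e.append(s / k)
    # det(xI - G) = sum_k (-1)^k e_k x^(m-k).
    return [(-1) ** k * e[k] for k in range(m + 1)]

print(char_poly([[1, 1], [1, 0]]))                    # x^2 - x - 1
print(char_poly([[2, 0, 0], [0, 3, 0], [0, 0, 5]]))   # (x-2)(x-3)(x-5)
```

The cost is dominated by the $m$ matrix products, each of which takes $O(m^3)$ operations with the schoolbook algorithm used here.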


▷ V.30. The matrix tree theorem. Let $\Gamma$ be a directed graph without loops and associated
matrix $G$, with $g_{a,b}$ the weight of edge $(a, b)$. The Laplacian matrix $L[G]$ is defined by

$L[G]_{i,j} = -g_{i,j} + [\![ i = j ]\!]\, \delta_i, \qquad \text{where } \delta_i = \sum_k g_{i,k}.$

Let $L_1[G]$ be the matrix obtained by deleting the first row and first column of $L[G]$. Then, the
"tree polynomial"

$T_1[G] := \det L_1[G]$

enumerates all (oriented) spanning trees of $\Gamma$ rooted at node 1. [This classic result belongs to a
circle of ideas initiated by Kirchhoff, Sylvester, Borchardt and others in the 19th century. See,
e.g., the discussions by Knuth [306, p. 582-583] and Moon [364].] ◁
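
The statement can be tested on small weighted digraphs. In the sketch below (illustrative names; vertices are indexed from 0, so the root "node 1" of the note is index 0), the determinant of the reduced Laplacian is compared with a brute-force sum over all ways of giving each non-root vertex one outgoing edge. The orientation convention, edges pointing towards the root, is the one that matches $\delta_i$ being a sum over outgoing edges.

```python
from fractions import Fraction
from itertools import product

def determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, det = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return det

def tree_polynomial(g):
    """det of the Laplacian of g with row 0 and column 0 removed."""
    m = len(g)
    lap = [[(sum(g[i]) if i == j else 0) - g[i][j] for j in range(m)]
           for i in range(m)]
    return determinant([row[1:] for row in lap[1:]])

def brute_force_trees(g):
    """Weighted count of spanning trees whose edges all point towards vertex 0."""
    m, total = len(g), 0
    for choice in product(range(m), repeat=m - 1):   # choice[v-1] = parent of v
        weight, ok = 1, True
        for v in range(1, m):
            weight *= g[v][choice[v - 1]]
            node, steps = v, 0
            while node != 0 and steps < m:           # follow parents to the root
                node, steps = choice[node - 1], steps + 1
            ok = ok and node == 0
        if ok:
            total += weight
    return total

complete = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
weighted = [[0, 2, 0], [1, 0, 3], [0, 5, 0]]
print(tree_polynomial(complete), tree_polynomial(weighted))
```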


Weighted graphs, word models, and finite automata. The numeric substitution
$\sigma : g_{a,b} \mapsto 1$ transforms the formal adjacency matrix $G$ of $\Gamma$ into the usual adjacency
matrix. In particular, the number of paths of length $n$ is obtained, under this
substitution, as $[z^n](I - zG)^{-1}$. As already noted, it is possible to consider weighted graphs,
where the $g_{a,b}$ are assigned positive real-valued weights, with the weight of a path
being defined by the product of its edges' weights. One finds that $[z^n](I - zG)^{-1}$ equals
the total weight of all paths of length $n$. If furthermore the assignment is made in
such a way that $\sum_b g_{a,b} = 1$, for all $a$, then the matrix $G$, which is called a stochastic
matrix, can be interpreted as the transition matrix of a Markov chain. Naturally, the
formulae of Proposition V.6
Word problems corresponding to regular languages can be treated by the theory
of regular specifications whenever they have enough structure and an unambiguous
regular expression description is of tractable form. (This is the main theme that has
been pursued in Sections V.2 and V.3.) The dual point of view of automata theory
introduced in Section I.4.2 (p. 52) proves useful whenever no such direct description
is in sight. Finite automata fall under the theory of paths in graphs, so that
Proposition V.6 is applicable to them. Indeed, the language $\mathcal{L}$ accepted by a finite automaton
$A$, with set of states $Q$, initial state $q_0$, and $Q_f$ the set of final states, decomposes as

$\mathcal{L} = \sum_{q \in Q_f} \mathcal{F}^{(q_0, q)},$

where $\mathcal{F}^{(q_0, q)}$ is the set of paths from the initial state $q_0$ to one of the final states, $q$.
(The corresponding graph $\Gamma$ is obtained from $A$ by collapsing multiple edges between
any two vertices, $i$ and $j$, into a single edge equipped with a weight that is the sum
of the weights of all the letters leading from $i$ to $j$.) Proposition V.6 is then clearly
applicable.


Profiles. By profile of a set of paths is meant here the collection of the $m^2$
statistics $N = (N_{1,1}, \ldots, N_{m,m})$ where $N_{i,j}$ is the number of times the edge $(i \to j)$ is
traversed. This notion is for instance consistent with the notion of profile given earlier
for lattice paths in Section V.3. It also contains the information regarding the letter
composition of words in a regular language and is thus compatible with the notion of
profile introduced in Section V.2.

Let $\Gamma$ be a graph with edge $(a, b)$ weighted by $g_{a,b}$. Then, the BGF of paths with
$u$ marking the number of times a particular edge $(c, d)$ is traversed is in matrix form

$(I - z\widetilde{G})^{-1}, \qquad \text{with } \widetilde{G} = G\left[ g_{c,d} \mapsto u\, g_{c,d} \right].$

The entry $(i, j)$ in this matrix gives the BGF of paths with origin $i$ and destination $j$.
The GF of cumulated values (moments of order 1) is then obtained from there in the
usual way, by differentiation followed by the substitution $u = 1$. Higher moments are
similarly attainable by successive differentiations.
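
For the first moment, the differentiation can be carried out once and for all. The following worked step is a sketch under the notation above, with $E_{c,d}$ (a symbol introduced here, not in the text) denoting the matrix whose only nonzero entry is a 1 in position $(c, d)$. It uses the identity $\frac{d}{du} A^{-1} = -A^{-1} \frac{dA}{du} A^{-1}$ with $A = I - z\widetilde{G}$ and $\frac{dA}{du} = -z\, g_{c,d}\, E_{c,d}$:

```latex
\left.\frac{\partial}{\partial u}\,(I - z\widetilde{G})^{-1}\right|_{u=1}
  = z\, g_{c,d}\,(I - zG)^{-1} E_{c,d}\,(I - zG)^{-1},
\qquad\text{whose entry } (i,j) \text{ is}\qquad
  z\, g_{c,d}\, F^{(i,c)}(z)\, F^{(d,j)}(z).
```

This agrees with the combinatorial decomposition of a path from $i$ to $j$ with one distinguished traversal of $(c, d)$: a path from $i$ to $c$, the edge $(c, d)$ itself, then a path from $d$ to $j$.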


V.5.2. Analytic aspects. In full generality, the components of a linear system
of equations may exhibit the whole variety of behaviours obtained for the OGFs of
regular languages in Section V.2, p. 278. However, positivity coupled with some
simple ancillary conditions (irreducibility and aperiodicity defined below) entails that
the GFs of interest closely resemble the extremely simple rational function,

$\frac{1}{1 - z/\rho} \equiv \frac{1}{1 - \lambda_1 z},$

where $\rho$ is the dominant positive singularity and $\lambda_1 = 1/\rho$ is a well-characterized
eigenvalue of $T$. Accordingly, the asymptotic phenomena associated with such
systems are highly predictable and coefficients are of the pure exponential form $c \cdot \rho^{-n}$.
We propose to expose here the general theory and treat in the next section classical
applications to statistics of paths in graphs and languages recognized by finite automata.
FIGURE V.14. Irreducibility conditions. Left: a strongly connected digraph. Right: a
weakly connected digraph that is not strongly connected is a collection of strongly
connected components linked by a directed acyclic graph.


Irreducibility and aperiodicity of matrices and graphs. From this point on in 
this section, we consider matrices with nonnegative entries. Two notions are essential, 
irreducibility and aperiodicity (the terms are borrowed from Markov chain theory and 
matrix theory). 

For $A$ a scalar matrix of dimension $m \times m$ (with nonnegative entries), a crucial
role is played by the dependency graph; this is the (directed) graph with vertex set
$V = \{1 .. m\}$ and edge set containing the directed edge $(a \to b)$ iff $A_{a,b} \ne 0$. The
reason for this terminology is the following: Let $A$ represent the linear transformation
$\left\{ y_i^* = \sum_j A_{i,j}\, y_j \right\}_i$; then, the fact that an entry $A_{i,j}$ is nonzero means that $y_i^*$ depends
effectively on $y_j$ and is translated by the directed edge $(i \to j)$ in the dependency
graph.

Definition V.5. The nonnegative matrix A is called irreducible if its dependency graph 
is strongly connected (i.e., any two vertices are connected by a directed path). 


By considering only simple paths, it is then seen that irreducibility is equivalent
to the condition that $(I + A)^m$ has all its entries that are strictly positive. See Figure 14
for a graphical rendering of irreducibility and for the general structure of a (weakly
connected) digraph.
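
The positivity criterion translates directly into a test. The sketch below is illustrative only (the function name is chosen here): it works with 0/1 patterns rather than actual entries, since only the positions of the nonzero entries of $(I + A)^m$ matter.

```python
def is_irreducible(a):
    """Test whether (I + A)^m has only positive entries, working on 0/1 patterns."""
    m = len(a)
    step = [[1 if (i == j or a[i][j] != 0) else 0 for j in range(m)]
            for i in range(m)]
    reach = step
    for _ in range(m - 1):           # after the loop, reach is the pattern of (I + A)^m
        reach = [[1 if any(reach[i][k] and step[k][j] for k in range(m)) else 0
                  for j in range(m)] for i in range(m)]
    return all(all(row) for row in reach)

cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # directed 3-cycle
triangular = [[1, 1], [0, 1]]                 # vertex 1 cannot reach vertex 0
print(is_irreducible(cycle), is_irreducible(triangular))
```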


Definition V.6. A strongly connected digraph $\Gamma$ is said to be periodic with parameter
$d$ iff the vertex set $V$ can be partitioned into $d$ classes, $V = V_0 \cup \cdots \cup V_{d-1}$, in such a
way that any edge whose source is an element of a $V_j$ has its destination in $V_{j+1 \bmod d}$.

The largest possible $d$ is called the period. If no decomposition exists with $d \ge 2$,
so that the period has the trivial value 1, then the graph and all the matrices that admit
it as their dependency graph are called aperiodic.


For instance, a directed 10-cycle is periodic with parameters $d = 1, 2, 5, 10$ and
the period is 10. Figure 15 illustrates the notion. Periodicity implies that the existence
of paths of length $n$ between any two given nodes $i, j$ is constrained by the congruence
class $n \bmod d$. Conversely, aperiodicity entails the existence, for all $n$ sufficiently
large, of paths of length $n$ connecting $i, j$.

FIGURE V.15. Periodicity notions: the overall structure of a periodic graph with $d = 4$
(left), an aperiodic graph (middle) and a periodic graph of period 2 (right).

From the definition, a matrix $A$ with period $d$ has, up to simultaneous permutation
of its rows and columns, a cyclic block structure

$\begin{pmatrix} 0 & A_{0,1} & 0 & \cdots & 0 \\ 0 & 0 & A_{1,2} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & A_{d-2,d-1} \\ A_{d-1,0} & 0 & 0 & \cdots & 0 \end{pmatrix}$

where the blocks $A_{i,i+1}$ are reflexes of the connectivity between $V_i$ and $V_{i+1}$. In
the case of a period $d$, the matrix $A^d$ admits a diagonal square block decomposition
where each of its diagonal blocks is aperiodic (and of a smaller dimension than the
original matrix). Then, the matrices $A^{\nu d}$ can be analysed block by block, and the
analysis reduces to the aperiodic case. Similarly for powers $A^{\nu d + r}$ for any fixed $r$ as
$\nu$ varies. In other words, the irreducible periodic case with period $d \ge 2$ can always
be reduced to a collection of $d$ irreducible aperiodic subproblems. For this reason, we
usually postulate in our statements both an irreducibility condition and an aperiodicity
condition.


▷ V.31. Sufficient conditions for aperiodicity. Any one of the following conditions suffices to
guarantee aperiodicity of the nonnegative matrix $T$:
(i) $T$ has (strictly) positive entries;
(ii) some power $T^s$ has (strictly) positive entries;
(iii) $T$ is irreducible and at least one diagonal element of $T$ is nonzero;
(iv) $T$ is irreducible and the dependency graph of $T$ is such that there exist two circuits
(closed paths) that are of relatively prime lengths.

(Any such condition implies in turn the existence of a unique dominant eigenvalue of $T$, which
is simple, according to Theorem V.7 and Note 34 below.) ◁


▷ V.32. Computability of the period. There exists a polynomial time algorithm that determines
the period of a matrix. (Hint: in order to verify that $\Gamma$ is periodic with parameter $d$, develop a
breadth-first search tree, label nodes by their level, and check that edges satisfy suitable
congruence conditions modulo $d$.) ◁
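
The hint can be made concrete as follows. This is a sketch of one standard way to do it, assuming the digraph is strongly connected and given by adjacency lists (the function name is illustrative): for every edge $(u \to v)$, any admissible parameter $d$ must divide $\mathrm{level}(u) + 1 - \mathrm{level}(v)$, so the period is the gcd of these quantities.

```python
from collections import deque
from math import gcd

def period(adj):
    """Period of a strongly connected digraph given as adjacency lists.

    adj[u] lists the successors of u; strong connectivity is assumed, not checked.
    """
    level = {0: 0}
    queue = deque([0])
    while queue:                          # breadth-first search from vertex 0
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    d = 0
    for u in range(len(adj)):             # each edge u -> v forces d | level[u] + 1 - level[v]
        for v in adj[u]:
            d = gcd(d, level[u] + 1 - level[v])
    return d

ten_cycle = [[(i + 1) % 10] for i in range(10)]
with_loop = [[0, 1], [0]]                 # a self-loop forces period 1
two_circuits = [[1], [2, 0], [0]]         # circuits of lengths 2 and 3
print(period(ten_cycle), period(with_loop), period(two_circuits))
```

The whole computation is linear in the number of vertices and edges, apart from the gcd operations.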
Paths in strongly connected graphs. For analytic combinatorics, the importance
of irreducibility and aperiodicity conditions stems from the fact that they guarantee
uniqueness and simplicity of a dominant pole of path generating functions.


Theorem V.7 (Asymptotics of paths in graphs). Consider the matrix

$F(z) = (I - zT)^{-1},$

where $T$ is a scalar nonnegative matrix, in particular, the adjacency matrix of a graph
$\Gamma$ equipped with positive weights. Assume that $T$ is irreducible. Then all entries
$F^{(i,j)}(z)$ of $F(z)$ have the same radius of convergence $\rho$, which can be defined in two
equivalent ways:
(i) as $\rho = \lambda_1^{-1}$ with $\lambda_1$ the largest positive eigenvalue of $T$;
(ii) as the smallest positive root of the determinantal equation: $\det(I - zT) = 0$.
Furthermore, the point $\rho = \lambda_1^{-1}$ is a simple pole of each $F^{(i,j)}(z)$.

If $T$ is irreducible and aperiodic, then $\rho = \lambda_1^{-1}$ is the unique dominant singularity
of each $F^{(i,j)}(z)$, and

$[z^n] F^{(i,j)}(z) = \varphi_{i,j}\, \lambda_1^n + O(\Lambda^n), \qquad 0 \le \Lambda < \lambda_1,$

for computable constants $\varphi_{i,j} > 0$.
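
A small numerical illustration of the statement, under assumptions made here: the matrix $T$ is the $2 \times 2$ example with entries $1, 1, 1, 0$ (irreducible and aperiodic), and $\lambda_1$ is approximated by plain power iteration, which is a convenient numerical device and not the method of the proof. The ratio of successive path counts should approach $\lambda_1$, and $\rho = 1/\lambda_1$ should be a root of $\det(I - zT) = 1 - z - z^2$.

```python
def mat_vec(t, x):
    return [sum(t[i][k] * x[k] for k in range(len(x))) for i in range(len(t))]

def dominant_eigenvalue(t, iterations=200):
    """Power iteration; it converges when t is irreducible and aperiodic."""
    x = [1.0] * len(t)
    lam = 0.0
    for _ in range(iterations):
        y = mat_vec(t, x)
        lam = max(y)                  # normalising factor, tends to lambda_1
        x = [c / lam for c in y]
    return lam

T = [[1, 1],
     [1, 0]]
lambda1 = dominant_eigenvalue(T)
rho = 1 / lambda1

# Path counts [z^n] F^{(0,0)}(z) = (T^n)_{0,0}, read off the first column of T^n.
col = [1, 0]
counts = []
for _ in range(40):
    counts.append(col[0])
    col = mat_vec(T, col)
ratio = counts[-1] / counts[-2]
print(lambda1, rho, ratio)
```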
PROOF. The proof proceeds by stages, building up properties of the $F^{(i,j)}$ by means
of the relations that bind them, with suitable exploitation of Proposition V.6, p. 321 in
conjunction with Pringsheim's Theorem (p. 229). In Parts (i)-(v), we assume that the
matrix $T$ is aperiodic. Periodicity is finally examined in Part (vi).

(i) All $F^{(i,j)}$ have the same radius of convergence. Simple upper and lower
bounds show that each $F^{(i,j)}$ has a finite nonzero radius of convergence $\rho_{i,j}$. By
Pringsheim's Theorem, this $\rho_{i,j}$ is necessarily a singularity of the function $F^{(i,j)}$.
Since each $F^{(i,j)}$ is a rational function, it then has a pole at $\rho_{i,j}$, hence becomes
infinite as $z \to \rho_{i,j}$. Now, the matrix $F$ satisfies the identities

(80) $F = I + zTF, \qquad \text{and} \qquad F = I + zFT.$

Thus, given that $T$ is irreducible, each $F^{(i,j)}$ is positively linearly related to any other
$F^{(k,\ell)}$. Hence, the $F^{(i,j)}$ must all become infinite as soon as one of them does.
Consequently, all the $\rho_{i,j}$ are equal; we let $\rho$ denote their common value.

(ii) All poles are of the same multiplicity. By a similar argument, we see that all
the $F^{(i,j)}$ must have the same multiplicity $\kappa$ of their common pole $\rho$, since otherwise,
one function would be of slower growth, and a contradiction would result with the
linear relations stemming from (80). We thus have, for some $\varphi_{i,j} > 0$:

$F^{(i,j)}(z) \underset{z \to \rho}{\sim} \frac{\varphi_{i,j}}{(1 - z/\rho)^{\kappa}}.$

(iii) The common multiplicity of poles is $\kappa = 1$. This property results from
the expression of the GF of all rooted circuits (Proposition V.6, Part (ii)) in terms of a
logarithmic derivative, which has by construction only simple poles. Hence, a positive
linear combination of some of the $F^{(i,j)}$ has only a simple pole, so that $\kappa = 1$ and

(81) $F^{(i,j)}(z) \underset{z \to \rho}{\sim} \frac{\varphi_{i,j}}{1 - z/\rho}.$
Another consequence is that we have $\rho = 1/\lambda_1$, where $\lambda_1$ is an eigenvalue of matrix
$T$, which then satisfies the property that $\lambda_1 \ge |\lambda|$ for any eigenvalue $\lambda$ of $T$: in matrix
theory terminology, such an eigenvalue is called "dominant".


(iv) There are positive dominant eigenvectors. From the relations (80) satisfied
by the $F^{(i,j)}(z)$ with $j$ fixed and from (81), one finds as $z \to \rho$

(82) $\frac{\varphi_{i,j}}{1 - z/\rho} \sim \rho\, \frac{\sum_k T_{i,k}\, \varphi_{k,j}}{1 - z/\rho}, \qquad \text{where } T = (T_{i,j}).$

This expresses the fact that the column vector $(\varphi_{1,j}, \ldots, \varphi_{m,j})^t$ is a right eigenvector
corresponding to the eigenvalue $\lambda_1 = \rho^{-1}$. Similarly, for each fixed $i$, the row
vector $(\varphi_{i,1}, \ldots, \varphi_{i,m})$ is found to be a left eigenvector. By Part (ii), these eigenvectors
have all their components strictly positive.


(v) The eigenvalue d, is simple. This property is needed in order to identify the 
i,j coefficients. We base our proof on the Jordan normal form and simple inequali- 
ties. 

Assume first that there are two different Jordan blocks corresponding to the eigen- 
value \;. Then there exist two vectors, v = (v1,...,Um)’ and w = (wy,..-,Wm)*, 
such that 

Tv = 1, Tw = A,u, 


where we may assume that the eigenvector $v$ has positive coordinates, given Part (iv).
Let $j_0$ be an index such that

$$\frac{|w_{j_0}|}{v_{j_0}} = \max_{j} \frac{|w_j|}{v_j}.$$

By possibly changing $w$ to $-w$ and by rescaling, we may freely assume that $w_{j_0} = v_{j_0}$.
Also, since $v$ and $w$ are not collinear, there must exist $j_1$ such that $|w_{j_1}| < v_{j_1}$.
In summary:


(83)    $$w_{j_0} = v_{j_0}, \qquad |w_{j_1}| < v_{j_1}, \qquad \forall j:\ |w_j| \le v_j.$$


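Consider finally the two relations $T^{\nu} v = \lambda_1^{\nu} v$ and $T^{\nu} w = \lambda_1^{\nu} w$, and examine
consequences for the $j_0$ components. One has

(84)    $$\lambda_1^{\nu} v_{j_0} = \sum_{k=1}^{m} U_{j_0,k}\, v_k, \qquad \lambda_1^{\nu} w_{j_0} = \sum_{k=1}^{m} U_{j_0,k}\, w_k,$$

where each $U_{j,k}$, the $(j,k)$ entry of $T^{\nu}$, is positive, by the irreducibility and
aperiodicity assumptions. But then, by the triangle inequality, there is a contradiction
between (84) and (83): since $w_{j_0} = v_{j_0}$, subtracting the two relations gives
$\sum_k U_{j_0,k}(v_k - w_k) = 0$, although every term is nonnegative and the term $k = j_1$ is
strictly positive. Thus, there cannot be two distinct Jordan blocks corresponding to $\lambda_1$.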

There only remains to exclude the existence of a Jordan block of dimension $\ge 2$
associated to $\lambda_1$. If such a Jordan block were present, there would exist a vector $w$
such that

(85)    $$Tv = \lambda_1 v, \quad Tw = \lambda_1 w + v, \qquad \text{implying} \qquad T^{\nu} v = \lambda_1^{\nu} v, \quad T^{\nu} w = \lambda_1^{\nu} w + \nu \lambda_1^{\nu-1} v.$$

By simple bounds obtained from comparing $w$ to $v$ componentwise, it is found that
the vector $T^{\nu} w$ must have all its coordinates that are $O(\lambda_1^{\nu})$. Upon taking $\nu \to \infty$,
a contradiction is reached with the last relation of (85), where the growth of these
coordinates is of the form $\nu \lambda_1^{\nu}$. Thus, a Jordan block of dimension $\ge 2$ is also
excluded, and the eigenvalue $\lambda_1$ is simple.

Footnote 12. In matrix theory, a dominant eigenvalue ($\lambda_1$) is one that is largest in modulus, while, for an analytic
function, a dominant singularity ($\rho$) is one that is smallest in modulus. The two notions are reconciled by
the fact that here singularities are inverses of eigenvalues ($\rho = 1/\lambda_1$).


(vi) Aperiodicity of $T$ is equivalent to the existence of a uniquely dominant eigenvalue.
If $\lambda_1$ uniquely dominates, meaning that $\lambda_1 > |\lambda|$ for all eigenvalues $\lambda \ne \lambda_1$,
then each $F^{\langle i,j\rangle}$ has a simple pole at $\rho$ that is its unique dominant singularity. Hence
the coefficients $[z^n]F^{\langle i,j\rangle}(z)$ are nonzero for $n$ large enough, since they are asymptotic
to $\varphi_{i,j}\lambda_1^n$ by (81). This last property ensures aperiodicity.

Conversely, if $T$ is aperiodic, then $\lambda_1$ uniquely dominates. Indeed, suppose that
$\mu \ne \lambda_1$ is an eigenvalue of $T$ such that $|\mu| = \lambda_1$, with $w$ a corresponding eigenvector. We
would have $T^{\nu} v = \lambda_1^{\nu} v$ and $T^{\nu} w = \mu^{\nu} w$. But then, by an argument similar to the
one used in Part (v), upon making use of inequalities (83), we would need to have $w$
and $v$ collinear, which is absurd.

We leave it as an exercise to the reader to verify the stronger property that identifies
the period with the number of dominant eigenvalues: see Note 33.


Several of these arguments will inspire the discussion, in Chapter VII, of the 
harder problem of analysing coefficients of algebraic functions defined by positive 
polynomial systems (Subsection VII. 6.3, p. 464). 


> V.33. Periodicities. If $T$ has period $d$, then the support of each $F^{\langle j,j\rangle}(z)$ is included in $d\mathbb{Z}$,
hence there are at least $d$ conjugate singularities, corresponding to eigenvalues of the form
$\lambda_1 e^{2ik\pi/d}$. There are no other eigenvalues of modulus $\lambda_1$ since $T^d$ is built out of irreducible
blocks, each with the unique dominant eigenvalue $\lambda_1^d$. ◁


> V.34. The classical Perron-Frobenius Theorem. The proof of Theorem V.7 immediately 
gives: 

Theorem. Let $A$ be a matrix with nonnegative elements that is assumed to be irreducible.
The eigenvalues of $A$ can be ordered in such a way that

$$\lambda_1 = |\lambda_2| = \cdots = |\lambda_d| > |\lambda_{d+1}| \ge |\lambda_{d+2}| \ge \cdots$$

and all the eigenvalues of largest modulus are simple. Furthermore, the quantity $d$ is precisely
equal to the period of the dependency graph. In particular, in the aperiodic case $d = 1$, there
is unicity of the dominant eigenvalue. In the periodic case $d \ge 2$, the whole spectrum has a
rotational symmetry: it is invariant under the set of transformations

$$\lambda \mapsto \lambda e^{2ij\pi/d}, \qquad j = 0, 1, \ldots, d-1.$$

The properties of positive and of nonnegative matrices have been superbly elicited by Perron [392]
in 1907 and by Frobenius [222] in 1908-1912. The corresponding theory has far-reaching
implications: it lies at the basis of the theory of finite Markov chains and it extends
to positive operators in infinite-dimensional spaces [318]. Excellent treatments of
Perron-Frobenius theory are to be found in the books of Bellman [26, Ch. 16], Gantmacher [225,
Ch. 13], as well as Karlin and Taylor [290, p. 536-551]. ◁
> V.35. Unrooted circuits. Consider a strongly connected weighted graph $\Gamma$ with adjacency
matrix $G = (g_{i,j})$. Let $\mathcal{RC}$ be the class of all rooted circuits and $\mathcal{PRC}$ the subclass of those
that are primitive (i.e., they differ from all their cyclic shifts). Let also $\mathcal{UC}$ be the class of all
unrooted circuits (no origin distinguished) and $\mathcal{PUC}$ the subclass of those that are primitive.
Define the adjacency matrix $G^{\circ s} := ((g_{i,j})^s)$ obtained by raising each entry of $G$ to the $s$th
power. Set finally $\Delta_G(z) := \det(I - zG)$. We find

$$RC(z,G) = \sum_{k\ge1} PRC(z^k, G^{\circ k}), \qquad PUC(z,G) = \int_0^z PRC(t,G)\,\frac{dt}{t}, \qquad UC(z,G) = \sum_{k\ge1} PUC(z^k, G^{\circ k}),$$

upon mimicking the reasoning of APPENDIX A: Cycle construction, p. 674. This results in

$$UC(z) = \sum_{k\ge1} \frac{\varphi(k)}{k} \log\frac{1}{\Delta_{G^{\circ k}}(z^k)},$$

$$[z^n]UC(z) \sim \frac{\lambda_1^n}{n}, \qquad [z^n]PUC(z) \sim \frac{\lambda_1^n}{n},$$

where $\varphi(k)$ is the Euler totient function and the two asymptotic estimates hold under
irreducibility and aperiodicity conditions. These estimates can be regarded as a Prime Number
Theorem for walks in graphs. (See [450] for related facts and zeta functions of graphs.)
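
As a hedged numerical illustration of the identities of Note V.35 (it is not part of the original text), the sketch below assumes an unweighted graph, so that every entrywise power $G^{\circ k}$ coincides with $G$. Under that assumption, Möbius inversion of $RC(z,G) = \sum_k PRC(z^k,G)$ gives the number of primitive unrooted circuits of length $n$ as $\frac{1}{n}\sum_{d \mid n} \mu(d)\,\mathrm{tr}(G^{n/d})$. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def trace_power(g, n):
    """tr(G^n), the number of rooted circuits of length n (n >= 1)."""
    p = g
    for _ in range(n - 1):
        p = mat_mul(p, g)
    return sum(p[i][i] for i in range(len(g)))


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # a squared prime factor: mu = 0
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def primitive_unrooted_circuits(g, n):
    """(1/n) * sum over d | n of mu(d) * tr(G^(n/d)); assumes a 0/1 matrix G."""
    total = sum(mobius(d) * trace_power(g, n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n


# Assumed example: the complete digraph with loops on two vertices.
# Its primitive unrooted circuits are in bijection with binary Lyndon words.
complete2 = [[1, 1], [1, 1]]
lyndon_counts = [primitive_unrooted_circuits(complete2, n) for n in range(1, 7)]
```

For this assumed graph one has $\lambda_1 = 2$, and the counts grow like $2^n/n$, in line with the estimate of the note.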


Profiles. The proof of Theorem V.7 provides the form of a certain “residue ma- 
trix”, from which several probabilistic properties of paths follow. 


Lemma V.1 (Iteration of irreducible matrices). Let the nonnegative matrix $T$ be irreducible
and aperiodic, with $\lambda_1$ its dominant eigenvalue. Then the residue matrix $\Phi$
such that

(86)    $$(I - zT)^{-1} = \frac{\Phi}{1 - z\lambda_1} + O(1) \qquad (z \to \lambda_1^{-1})$$

has entries given by ($\langle x, y\rangle$ represents the scalar product $\sum_i x_i y_i$)

$$\varphi_{i,j} = \frac{r_i \ell_j}{\langle r, \ell\rangle},$$

where $r$ and $\ell$ are respectively right and left eigenvectors of $T$ corresponding to the
eigenvalue $\lambda_1$.


PROOF. We have seen that the matrix $\Phi = (\varphi_{i,j})$ has its rows and columns respectively
proportional to left and right eigenvectors belonging to the eigenvalue $\lambda_1$. Thus,
we have

$$\frac{\varphi_{i,j}}{\varphi_{1,j}} = \frac{\varphi_{i,1}}{\varphi_{1,1}},$$

while the $\varphi_{1,j}$ (respectively, $\varphi_{i,1}$) are the coordinates of a left (respectively, right)
eigenvector. There results that there exists a normalization constant $\xi$ such that

$$\varphi_{i,j} = \xi\, r_i \ell_j.$$

That normalization constant is then determined by the fact that the GF of circuits has
residue equal to $\rho = \lambda_1^{-1}$ at $z = \rho$, so that $\sum_i \varphi_{i,i} = 1$, leading to

$$1 = \xi \sum_j r_j \ell_j,$$

which implies the statement.
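
The lemma can be checked numerically on a small case. The sketch below is an illustration under stated assumptions, not part of the original text: it takes the matrix $T = \begin{pmatrix}1&1\\1&0\end{pmatrix}$ (an arbitrary choice), whose dominant eigenvalue is the golden ratio, and compares $T^n/\lambda_1^n$ with the residue matrix $r_i\ell_j/\langle r,\ell\rangle$. The comparison is legitimate because (86) implies $[z^n](I - zT)^{-1} = T^n \sim \lambda_1^n \Phi$. Function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_pow(t, n):
    """n-th power of t by repeated multiplication (adequate for small n)."""
    m = len(t)
    result = [[float(i == j) for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(result, t)
    return result


def residue_matrix(r, l):
    """Matrix with entries r_i * l_j / <r, l>, as in Lemma V.1."""
    scalar = sum(ri * li for ri, li in zip(r, l))
    return [[ri * lj / scalar for lj in l] for ri in r]


# Assumed example matrix; it is irreducible and aperiodic (loop on state 0).
T = [[1.0, 1.0], [1.0, 0.0]]
lam = (1 + math.sqrt(5)) / 2       # dominant eigenvalue of T
r = [lam, 1.0]                     # right eigenvector: T r = lam r
l = [lam, 1.0]                     # left eigenvector (T is symmetric)
phi = residue_matrix(r, l)

n = 40
tn = mat_pow(T, n)
scaled = [[tn[i][j] / lam ** n for j in range(2)] for i in range(2)]
```

The entries of `scaled` agree with those of `phi` up to an error of order $(|\lambda_2|/\lambda_1)^n$, and the trace of `phi` equals 1, as used in the proof.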
Equipped with the lemma, we can now state:

Theorem V.8 (Profiles of paths in graphs). Let $G$ be a nonnegative matrix associated
to a weighted digraph $\Gamma$, assumed to be irreducible and aperiodic. Let $\ell, r$ be
respectively the left and right eigenvectors corresponding to the dominant
(Perron-Frobenius) eigenvalue $\lambda_1$. Consider the collection $\mathcal{F}_n^{\langle a,b\rangle}$ of (weighted) paths in $\Gamma$
with fixed origin $a$ and final destination $b$. Then, the number of traversals of edge
$(s,t)$ in a random element of $\mathcal{F}_n^{\langle a,b\rangle}$ has mean

(87)    $$\tau_{s,t}\, n + O(1) \qquad \text{where} \qquad \tau_{s,t} := \frac{\ell_s\, g_{s,t}\, r_t}{\lambda_1 \langle \ell, r\rangle}.$$

In other words, a long random path tends to spend asymptotically a fixed (nonzero)
fraction of its time traversing any given edge. Accordingly, the number of visits to
vertex $s$ is also proportional to $n$ and obtained by summing the expression of (87)
according to all the possible values of $t$.

PROOF. First, the total weight (“number”) of paths in $\mathcal{F}_n^{\langle a,b\rangle}$ satisfies

(88)    $$[z^n]\left[(I - zG)^{-1}\right]_{a,b} \sim \lambda_1^n\, \frac{r_a \ell_b}{\langle \ell, r\rangle},$$
as follows from Lemma V.1. Next, introduce the modified matrix $H = (h_{i,j})$ defined
by

$$h_{i,j} = g_{i,j}\, u^{[\![\, i = s \,\wedge\, j = t \,]\!]}.$$

In other words, we mark each traversal of edge $(s,t)$ by the variable $u$. Then, the
quantity

(89)    $$[z^n] \left[ \left. \frac{\partial}{\partial u} (I - zH)^{-1} \right|_{u=1} \right]_{a,b}$$

represents the total number of traversals of edge $(s,t)$, with weights taken into account.
Simple algebra (footnote 13) shows that

(90)    $$\left. \frac{\partial}{\partial u} (I - zH)^{-1} \right|_{u=1} = (I - zG)^{-1}\,(zH')\,(I - zG)^{-1},$$

where $H' := (\partial_u H)_{u=1}$ has all its entries equal to 0, except for the $s,t$ entry whose
value is $g_{s,t}$. By the calculation of the residue matrix in Lemma V.1, the coefficient
of (89) is then asymptotic to

(91)    $$[z^n]\, \frac{\varphi_{a,s}\, g_{s,t}\, \varphi_{t,b}\; z}{(1 - \lambda_1 z)^2} \sim n \lambda_1^{n-1}\, \frac{r_a \ell_s\, g_{s,t}\, r_t \ell_b}{\langle \ell, r\rangle^2}.$$

Comparison of (91) and (88) finally yields the result since the relative error terms are
$O(n^{-1})$ in each case.
Footnote 13. If $A$ is an operator depending on $u$, one has $\partial_u(A^{-1}) = -A^{-1}(\partial_u A)A^{-1}$, which is a
noncommutative generalization of the usual differentiation rule for inverses.

Another consequence of this last proof and Equation (88) is that the numbers of
paths starting at $a$ and ending at either $b$ or $c$ satisfy

(92)    $$\lim_{n\to\infty} \frac{F_n^{\langle a,b\rangle}}{F_n^{\langle a,c\rangle}} = \frac{\ell_b}{\ell_c}.$$

In other words, the quantity

$$\frac{\ell_b}{\sum_j \ell_j}$$

is the asymptotic probability that a random path with origin fixed at some point $a$ but
otherwise unconstrained will end up at point $b$ after a large number of steps. Such
properties are strongly evocative of Markov chain theory discussed below in Example 13.
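
The mean of Theorem V.8 can be compared with an exact count on a small case. The sketch below is an illustration only: the matrix $G = \begin{pmatrix}1&1\\1&0\end{pmatrix}$, the path length and the function names are assumptions. It splits every path of length $n$ from $a$ to $b$ at each traversal of the edge $(s,t)$, so that the total number of traversals is $\sum_{k} (G^k)_{a,s}\, g_{s,t}\, (G^{n-1-k})_{t,b}$, and then compares the exact mean with $\tau_{s,t}\, n$.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mean_traversals(g, a, b, s, t, n):
    """Exact mean number of traversals of edge (s, t) among the weighted
    paths of length n from a to b (integer weights, exact arithmetic)."""
    m = len(g)
    powers = [[[int(i == j) for j in range(m)] for i in range(m)]]
    for _ in range(n):
        powers.append(mat_mul(powers[-1], g))
    total_weight = powers[n][a][b]
    traversals = sum(powers[k][a][s] * g[s][t] * powers[n - 1 - k][t][b]
                     for k in range(n))
    return traversals / total_weight


def tau(g, l, r, lam, s, t):
    """Asymptotic edge frequency l_s g_{s,t} r_t / (lam <l, r>)."""
    scalar = sum(x * y for x, y in zip(l, r))
    return l[s] * g[s][t] * r[t] / (lam * scalar)


# Assumed example: loop on state 0, edges 0 -> 1 and 1 -> 0.
G = [[1, 1], [1, 0]]
lam = (1 + math.sqrt(5)) / 2
r = [lam, 1.0]
l = [lam, 1.0]

n = 500
observed = mean_traversals(G, 0, 0, 0, 0, n) / n    # exact mean, per step
predicted = tau(G, l, r, lam, 0, 0)                 # equals 1 / sqrt(5)
```

The two values differ by a quantity of order $1/n$, consistent with the $O(1)$ error term in (87).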


> V.36. Residues and projections. Let $E = \mathbb{C}^m$ be the ambient space, where $m$ is the dimension
of $T$, assumed to be irreducible and aperiodic. There exists a direct sum decomposition
$E = F_1 \oplus F_2$ where $F_1$ is the 1-dimensional eigenspace generated by the eigenvector ($r$)
corresponding to eigenvalue $\lambda_1$ and $F_2$ is the supplementary space which is the direct sum of
characteristic spaces corresponding to the other eigenvalues $\lambda_2, \ldots$. (For the purposes of the
present discussion, one may freely think of the matrix as diagonalizable, with $F_2$ the sum of
eigenspaces associated to $\lambda_2, \ldots$.) Then $T$ as a linear operator acting on $E$ admits the
decomposition

$$T = \lambda_1 P + S,$$

where $P$ is the projector on $F_1$ and $S$ acts on $F_2$ with spectral radius $|\lambda_2|$, as illustrated by the
diagram:

(93)    [Diagram not reproduced: $T$ acts on $F_1$ as multiplication by $\lambda_1$ and on $F_2$ as $S$.]

By standard properties of projections, $P^2 = P$ and $PS = SP = 0$ so that $T^n = \lambda_1^n P + S^n$.
Consequently, there holds,

(94)    $$(I - zT)^{-1} = \sum_{n\ge0} z^n \left(\lambda_1^n P + S^n\right) = \frac{P}{1 - \lambda_1 z} + (I - zS)^{-1}.$$

Thus, the residue matrix $\Phi$ coincides with the projector $P$.
From there, one finds also

(95)    $$(I - zT)^{-1} = \frac{\Phi}{1 - \lambda_1 z} + \sum_{k\ge0} R_k \left(z - \lambda_1^{-1}\right)^k, \qquad R_k := S^k \left(I - \lambda_1^{-1} S\right)^{-k-1},$$

which provides a full expansion. ◁


> V.37. Algebraicity of the residues. One only needs to solve one polynomial equation in order
to determine $\lambda_1$. Then the entries of $\Phi$ and the $R_k$ in (95) are all obtained by rational operations
in the field generated by the entries of $T$ extended by the algebraic quantity $\lambda_1$: for instance, in
order to get an eigenvector, it suffices to replace one of the equations of the system $Tr = \lambda_1 r$
by a normalization condition, like $r_1 + \cdots + r_m = 1$. (Numerical procedures are likely to be
used instead for large matrices.) ◁

Automata and words. By Proposition V.6 (p. 321), the OGF of the language defined
by a deterministic finite automaton is expressible in terms of the quasi-inverse
$(I - zT)^{-1}$, where the matrix $T$ is a direct encoding of the automaton’s transitions.
Corollary V.7 and Lemma V.1 have been precisely custom-tailored for this situation.
As is by now usual, we shall allow weights on letters of the alphabet, corresponding to
a Bernoulli model on words. We say that an automaton is irreducible (resp. aperiodic)
if the underlying graph and the associated matrix are irreducible (resp. aperiodic).

Proposition V.7 (Random words and automata). Let $\mathcal{L}$ be a language recognized by
a deterministic finite automaton $\mathcal{A}$ whose graph is irreducible and aperiodic. The
number of words of $\mathcal{L}$ satisfies

$$L_n = c\,\lambda_1^n + O(\Lambda^n),$$

where $\lambda_1$ is the dominant (Perron-Frobenius) eigenvalue of the transition matrix of $\mathcal{A}$
and $c, \Lambda$ are real constants with $c > 0$ and $0 \le \Lambda < \lambda_1$.

In a random word of $\mathcal{L}_n$, the number of traversals of a designated vertex or edge
has a mean that is asymptotically linear in $n$ and is given by Theorem V.8.


> V.38. Unambiguous automata. A nondeterministic finite state automaton is said to be
unambiguous if the set of accepting paths for any given word comprises at most one element.
The translation into generating functions as described above also applies to such automata, even
though they are nondeterministic. ◁


> V.39. Concentration of distribution for the number of passages. Under the conditions of
the theorem, the standard deviation of the number of traversals of a designated node or edge
is $O(\sqrt{n})$. Thus in a random long path, the distribution of the number of such traversals is
concentrated. [Compared to (90), the calculation of the second moment requires taking a further
derivative, which leads to a triple pole. The second moment and the square of the mean, which
are each $O(n^2)$, are then found to cancel to main asymptotic order.] ◁


V.5.3. Applications. We now provide a few applications of Theorems V.7 and V.8.


— First, two simple applications are discussed. Example 11 studies briefly the 
case of words that are locally constrained in the sense that certain transitions 
between letters are forbidden. Example 12 revisits walks on an interval and 
develops an alternative matrix view of a problem otherwise amenable to 
continued fraction theory. 

— Example 13 makes explicit the way the fundamental theorem of finite Markov 
chain theory can be derived effortlessly as a consequence of the more gen- 
eral Theorem V.8. Example 14 compares on a simple problem, the devil’s 
staircase, the combinatorial and the Markovian approaches. 

— Example 15 comes back to words and develops simple consequences of an
important combinatorial construction, that of De Bruijn graphs. This graph
is precious in predicting in many cases the shape of the asymptotic results
that are to be expected when confronted with word problems. Example 16
concludes this section with a brief discussion of the special case of words with
excluded patterns, thereby leading to a quantitative version of Borges’ Theorem
(Note I.32, p. 58).


In all these examples, the counting estimates are of the form $c\lambda_1^n$, while the expectations
of parameters of interest have a linear growth.


EXAMPLE V.11. Locally constrained words. Consider a fixed alphabet $\mathcal{A} = \{a_1, \ldots, a_m\}$
and a set $\mathcal{F} \subseteq \mathcal{A}^2$ of forbidden transitions between consecutive letters. The set of words
over $\mathcal{A}$ with no such forbidden transition is denoted by $\mathcal{L}$ and is called a locally constrained
language. (The particular case where exactly all pairs of equal letters are forbidden corresponds
to Smirnov words and has been discussed on p. 249.)

$$T = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$$

FIGURE V.16. Locally constrained words: The transition matrix ($T$) associated to the
forbidden pairs $\mathcal{F} = \{ac, ad, bb, cb, cc, cd, da, db\}$, the corresponding automaton, and the
graph with widths of vertices and edges drawn in proportion to their asymptotic frequencies.
[Only the matrix is reproduced here.]


Clearly, the words of $\mathcal{L}$ are recognized by an automaton whose state space is isomorphic
to $\mathcal{A}$: state $q$ simply memorizes the fact that the last letter read was a $q$. The graph of the
automaton is then obtained by the collection of allowed transitions $(q, r) \mapsto r$, with $(q,r) \notin \mathcal{F}$.
(In other words, the graph of the automaton is the complete graph in which all edges that
correspond to forbidden transitions are deleted.) Consequently, the OGF of any locally constrained
language is a rational function. Its OGF is given by

$$(1, 1, \ldots, 1)\,(I - zT)^{-1}\,(1, 1, \ldots, 1)^t,$$

where $T_{ij}$ is 0 if $(a_i, a_j) \in \mathcal{F}$ and 1 otherwise. If each letter can follow any other letter in an
accepted word, the automaton is irreducible. The graph is aperiodic except in a few degenerate
cases (e.g., in the case where the allowed transitions would be $a \to b, c$; $b \to d$; $c \to d$; $d \to a$).
Under irreducibility and aperiodicity, the number of words will be $\sim c\lambda_1^n$ and each letter will
have on average an asymptotic constant frequency. (See (34) and (35) of Chapter IV for the
case of Smirnov words.)

For the example of Figure 16, the alphabet is $\mathcal{A} = \{a, b, c, d\}$. There are eight forbidden
transitions and the characteristic polynomial $\chi_G(\lambda) := \det(\lambda I - G)$ is found to be $\lambda^3(\lambda - 2)$.
Thus, one has $\lambda_1 = 2$. The right and left eigenvectors are found to be

$$r = (2, 2, 1, 1)^t, \qquad \ell = (2, 1, 1, 1).$$

Then, the matrix $\tau$, where $\tau_{s,t}$ represents the asymptotic frequency of transitions from letter $s$
to letter $t$, is found in accordance with Theorem V.8:

$$\tau = \begin{pmatrix} \frac14 & \frac14 & 0 & 0 \\ \frac18 & 0 & \frac1{16} & \frac1{16} \\ \frac18 & 0 & 0 & 0 \\ 0 & 0 & \frac1{16} & \frac1{16} \end{pmatrix}.$$

This means that a random path spends a proportion equal to $\frac14$ of its time on a transition between
an $a$ and a $b$, but much less ($\frac1{16}$) on transitions between pairs of letters $bc$, $bd$, $dc$, $dd$. The letter
frequencies in a random word of $\mathcal{L}$ are $(\frac12, \frac14, \frac18, \frac18)$, so that an $a$ is four times more frequent
than a $c$ or a $d$, and so on. See Figure 16 (right) for a rendering.
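
The figures of this example can be rechecked with exact rational arithmetic. The sketch below is an illustration, not part of the original text; it takes as assumptions the transition matrix of Figure V.16 and the eigenvectors $r = (2,2,1,1)^t$, $\ell = (2,1,1,1)$, verifies the eigenvector relations, and evaluates $\tau_{s,t} = \ell_s g_{s,t} r_t / (\lambda_1 \langle \ell, r\rangle)$ as in (87). Variable names are hypothetical.

```python
from fractions import Fraction

# Transition matrix for the forbidden pairs {ac, ad, bb, cb, cc, cd, da, db};
# rows and columns are indexed by the letters a, b, c, d in this order.
G = [[1, 1, 0, 0],    # a -> a, b
     [1, 0, 1, 1],    # b -> a, c, d
     [1, 0, 0, 0],    # c -> a
     [0, 0, 1, 1]]    # d -> c, d
lam = 2               # dominant eigenvalue
r = [2, 2, 1, 1]      # right (column) eigenvector
l = [2, 1, 1, 1]      # left (row) eigenvector
m = len(G)

# Eigenvector checks: G r = lam r and l G = lam l.
right_ok = all(sum(G[i][j] * r[j] for j in range(m)) == lam * r[i]
               for i in range(m))
left_ok = all(sum(l[i] * G[i][j] for i in range(m)) == lam * l[j]
              for j in range(m))

scalar = sum(li * ri for li, ri in zip(l, r))      # <l, r> = 8
tau = [[Fraction(l[s] * G[s][t] * r[t], lam * scalar) for t in range(m)]
       for s in range(m)]

# Summing over t gives the asymptotic frequency of each letter.
letter_freq = [sum(row) for row in tau]
```

The entries of `tau` sum to 1, as they must for a distribution over transitions.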

Various specializations, including multivariate GFs and nonuniform letter models, are
readily treated by this method. Bertoni et al. develop in [49] related variance and distribution
calculations in the case of the number of occurrences of a symbol in an arbitrary regular
language. ....... END OF EXAMPLE V.11.


EXAMPLE V.12. Walks on the interval revisited. As a direct illustration, consider the walks
associated to the graph $\Gamma(5)$ with vertex set $1, \ldots, 5$ and edges being formed of all pairs $(i, j)$
such that $|i - j| \le 1$. The matrix is

$$G(5) = \begin{pmatrix} 1&1&0&0&0 \\ 1&1&1&0&0 \\ 0&1&1&1&0 \\ 0&0&1&1&1 \\ 0&0&0&1&1 \end{pmatrix}.$$

The characteristic polynomial $\chi_{G(5)}(z) := \det(zI - G(5))$ factorizes as

$$\chi_{G(5)}(z) = z(z - 1)(z - 2)(z^2 - 2z - 2),$$

and its dominant root is $\lambda_1 = 1 + \sqrt{3}$. From there, one finds a left eigenvector (which is also a
right eigenvector since the matrix is symmetric):

$$r = \ell^t = (1, \sqrt{3}, 2, \sqrt{3}, 1)^t.$$


Thus a random path (with the uniform distribution over all paths corresponding to the weights
being equal to 1) visits nodes $1, \ldots, 5$ with frequencies proportional to the products $\ell_s r_s$
obtained by summing (87) over $t$, that is, to

$$1, \quad 3, \quad 4, \quad 3, \quad 1,$$

implying that the non-extremal nodes are visited more often: such nodes have higher degrees
of freedom, so that there tend to be more paths that traverse them.
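
A short numerical check of this example may be useful. The sketch below is an illustration under stated assumptions (it is not part of the original text, and the variable names are hypothetical): it verifies that $(1, \sqrt3, 2, \sqrt3, 1)$ is an eigenvector of $G(5)$ for $\lambda_1 = 1 + \sqrt3$, and evaluates the normalized products $\ell_s r_s$, which Theorem V.8 (after summation of (87) over $t$) attaches to the frequency of visits to vertex $s$.

```python
import math

# Adjacency matrix of the interval graph on 5 vertices, loops included.
G5 = [[1 if abs(i - j) <= 1 else 0 for j in range(5)] for i in range(5)]

lam = 1 + math.sqrt(3)
r = [1.0, math.sqrt(3), 2.0, math.sqrt(3), 1.0]    # right = left eigenvector

# Largest deviation of G5 r from lam r, over all coordinates.
residual = max(abs(sum(G5[i][j] * r[j] for j in range(5)) - lam * r[i])
               for i in range(5))

# Since G5 is symmetric, l_s r_s = r_s^2; normalize to get frequencies.
weights = [x * x for x in r]
visit_freq = [w / sum(weights) for w in weights]

# The characteristic polynomial z (z - 1)(z - 2)(z^2 - 2 z - 2) vanishes at lam.
char_poly_at_lam = lam * (lam - 1) * (lam - 2) * (lam ** 2 - 2 * lam - 2)
```

The frequencies come out as $\frac{1}{12}, \frac{3}{12}, \frac{4}{12}, \frac{3}{12}, \frac{1}{12}$, up to rounding.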

In fact, this example has structure. For instance, the graph $\Gamma(11)$ defined by an interval of
length 10, leads to a matrix with a highly factorable characteristic polynomial

$$\chi_{G(11)}(z) = z(z - 1)(z - 2)(z^2 - 2z - 2)(z^2 - 2z - 1)(z^4 - 4z^3 + 2z^2 + 4z - 2).$$

The reader may have recognized a particular case of lattice paths which fall under the theory
exposed in Section V.3. Indeed, according to Proposition V.3, the OGF of paths from vertex 1
to vertex 1 in the graph $\Gamma(k)$ with vertex set $\{1, \ldots, k\}$ is given by the continued fraction

$$\cfrac{1}{1 - z - \cfrac{z^2}{1 - z - \cfrac{z^2}{\ddots\, - \cfrac{z^2}{1 - z}}}}.$$

(The number of fraction bars is $k$.) From this it can be shown that the characteristic polynomial
of $G$ is an elementary variant of the Fibonacci-Chebyshev polynomial of Example 6, p. 303.
The analysis based on Theorem V.8 is simpler, albeit more rudimentary, as it only provides a
first-order asymptotic solution to the problem.

This example is typical: in many cases combinatorial problems have some amount of 
regularity. In such situations, all the resources of linear algebra are available, including the 
vast body of knowledge gathered over years on calculations of structured determinants; see for 
instance Krattenthaler’s survey [319] and the book [483]. ....... END OF EXAMPLE V.12. 


EXAMPLE V.13. Elementary theory of finite Markov chains. Consider the case where the row
sums of matrix $G$ are all equal to 1, that is, $\sum_j g_{i,j} = 1$. Such a matrix is called a stochastic
matrix. The quantity $g_{i,j}$ can then be interpreted as the probability of leaving state $i$ for state $j$,
assuming one is in state $i$. Assume that the matrix $G$ is irreducible and aperiodic. Clearly, the
matrix $G$ admits the column vector $r = (1, 1, \ldots, 1)^t$ as a right eigenvector corresponding to
the dominant eigenvalue $\lambda_1 = 1$. The left eigenvector $\ell$ normalized so that its elements sum
to 1 is called the (row) vector of stationary probabilities. It must be calculated by linear algebra
and its determination involves finding an element of the kernel of matrix $I - G$, which can be
done in a standard way.

$$\widehat{G} = \begin{pmatrix} \frac12&\frac12&0&0&0&0 \\ \frac12&0&\frac12&0&0&0 \\ \frac12&0&0&\frac12&0&0 \\ \frac12&0&0&0&\frac12&0 \\ \frac12&0&0&0&0&\frac12 \\ 1&0&0&0&0&0 \end{pmatrix}, \qquad G = \begin{pmatrix} 1&1&0&0&0&0 \\ 1&0&1&0&0&0 \\ 1&0&0&1&0&0 \\ 1&0&0&0&1&0 \\ 1&0&0&0&0&1 \\ 1&0&0&0&0&0 \end{pmatrix}$$

FIGURE V.17. The devil’s staircase ($m = 6$) and the two matrices that can model it.
[The drawing of the staircase is not reproduced.]

Application of Theorem V.8 and Equation (88) shows immediately the following:


Proposition V.8 (Stationary probabilities of Markov chains). Consider a weighted graph
corresponding to a stochastic matrix $G$ which is irreducible and aperiodic. Let $\ell$ be the normalized
left eigenvector corresponding to the eigenvalue 1. A random (weighted) path of length $n$ with
fixed origin and destination visits node $s$ a mean number of times asymptotic to $\ell_s n$ and
traverses edge $(s,t)$ a mean number of times asymptotic to $\ell_s g_{s,t} n$. A random path of length $n$
with fixed origin ends at vertex $s$ with probability asymptotic to $\ell_s$.


The vector $\ell$ is also known as the vector of stationary probabilities. The first-order asymptotic
property expressed by Proposition V.8 certainly constitutes the most fundamental result in
the theory of finite Markov chains. ....... END OF EXAMPLE V.13.
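
As an illustration of Proposition V.8, the stationary vector can be approximated by iterating the row-vector map $\ell \mapsto \ell G$. The sketch below is not part of the original text: the three-state stochastic matrix and the function name are assumptions chosen for the example, and plain iteration is only one of several standard ways of finding an element of the left kernel of $I - G$.

```python
def stationary(g, iterations=200):
    """Approximate the normalized left eigenvector of a stochastic matrix g
    for the eigenvalue 1, by repeatedly applying l -> l g."""
    m = len(g)
    l = [1.0 / m] * m                       # start from the uniform vector
    for _ in range(iterations):
        l = [sum(l[i] * g[i][j] for i in range(m)) for j in range(m)]
    return l


# Assumed example: an irreducible, aperiodic chain on three states.
G = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

l = stationary(G)

# Asymptotic frequency of each edge (s, t), namely l_s * g_{s,t}.
edge_freq = [[l[s] * G[s][t] for t in range(3)] for s in range(3)]
```

For this assumed chain the stationary probabilities are $(\frac14, \frac12, \frac14)$; the convergence is geometric, at a rate given by the second largest eigenvalue modulus.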


EXAMPLE V.14. The devil’s staircase. This example illustrates an elementary technique 
often employed in calculations of eigenvalues and eigenvectors. It presupposes that the matrix 
to be analysed can be reduced to a sparse form and has a regular enough structure. 

You live in a house that has a staircase with m steps. You come back home a bit loaded 
and at each second, you can either succeed in climbing a step or fall back all the way down. On 
the last step, you always stumble and fall back down (Figure 17). Where are you likely to be 
found at time n? 

Precisely, two slightly different models correspond to this informally stated problem. The
probabilistic model views it as a Markov chain with equally likely possibilities at each step
and is reflected by matrix $\widehat{G}$ in Figure 17. The combinatorial model just assumes all possible
evolutions (“histories”) of the system as equally likely and it corresponds to matrix $G$. We opt
here for the latter, keeping in mind that the same method basically applies to both cases.

We first write down the constraints expressing the joint properties of an eigenvalue $\lambda$ and
its right eigenvector $x = (x_1, \ldots, x_m)^t$. The equations corresponding to $(\lambda I - G)x = 0$ are
formed of a first batch of $m - 1$ relations,

(96)    $$(\lambda - 1)x_1 - x_2 = 0, \quad -x_1 + \lambda x_2 - x_3 = 0, \quad \cdots, \quad -x_1 + \lambda x_{m-1} - x_m = 0,$$

together with the additional relation (one cannot go higher than the last step):

(97)    $$-x_1 + \lambda x_m = 0.$$

The solution to (96) is readily found by pulling out successively $x_2, \ldots, x_m$ as functions of $x_1$:

(98)    $$x_2 = (\lambda - 1)x_1, \quad x_3 = (\lambda^2 - \lambda - 1)x_1, \quad \cdots, \quad x_m = (\lambda^{m-1} - \lambda^{m-2} - \cdots - 1)x_1.$$

Combined with the special relation (97), this last relation shows that $\lambda$ must satisfy the equation

(99)    $$1 - 2\lambda^m + \lambda^{m+1} = 0.$$


Let $\lambda_1$ be the largest positive root of this equation, existence and dominance being guaranteed
by Perron-Frobenius properties. Note that the quantity $\rho := 1/\lambda_1$ satisfies the characteristic
equation

$$1 - 2\rho + \rho^{m+1} = 0,$$

already encountered when discussing longest runs in words; the discussion of Example 2 then
grants us the existence of an isolated $\rho$ near $\frac12$, hence the fact that $\lambda_1$ is slightly less than 2.

Similar devices yield the left eigenvector $y = (y_1, \ldots, y_m)$. It is found easily that $y_j$ must
be proportional to $\lambda_1^{-j}$. We thus obtain from Theorem V.8 and Equation (92): The probability
of being in state $j$ (i.e., being on step $j$ of the stair) at time $n$ tends to the limit

$$\pi_j = \gamma \lambda_1^{-j},$$

where $\lambda_1$ is the root near 2 of the polynomial (99) and the normalization constant $\gamma$ is determined
by $\sum_j \pi_j = 1$. In other words, the distribution of the altitude at time $n$ is a truncated
geometric distribution with parameter $1/\lambda_1$. For instance, $m = 6$ leads to $\lambda_1 \doteq 1.98358$, and
the asymptotic probabilities of being in states $1, \ldots, 6$ are

(100)    0.50413, 0.25415, 0.12812, 0.06459, 0.03256, 0.01641,

exhibiting a clear geometric decay. In a simulation of a random trajectory for $n = 100$,
the frequencies observed are 0.44, 0.26, 0.17, 0.08, 0.04, 0.01, pretty much in
agreement with what is expected.
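
The numerical values of (100) can be recovered with a few lines of code. The sketch below is an illustration and not part of the original text: it locates the root of (99) near 2 by bisection (the bracketing interval $[1.5, 2]$ and the function names are assumptions), then normalizes the weights $\lambda_1^{-j}$.

```python
def dominant_root(m, tol=1e-13):
    """Root near 2 of 1 - 2 x^m + x^(m+1) = 0, found by bisection.
    Assumes m >= 2, so that the polynomial changes sign on [1.5, 2]."""
    def f(x):
        return 1 - 2 * x ** m + x ** (m + 1)

    lo, hi = 1.5, 2.0                 # f(lo) < 0 < f(hi) for every m >= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def staircase_probabilities(m):
    """Limit probabilities pi_j proportional to lambda_1^(-j), j = 1..m."""
    lam = dominant_root(m)
    weights = [lam ** (-j) for j in range(1, m + 1)]
    total = sum(weights)
    return lam, [w / total for w in weights]


lam6, probs6 = staircase_probabilities(6)
```

For $m = 6$ this gives `lam6` close to 1.98358 and probabilities that agree with (100) to within the rounding of the printed digits.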

Finally, the similarity with the longest run problem is easily explained. Let $u$ and $d$ be
letters representing steps upwards and downwards respectively. The set of paths from state 1 to
state 1 is described by the regular expression

$$\mathcal{P}_{1,1} = \left(d + ud + \cdots + u^{m-1}d\right)^{*},$$

corresponding to the generating function

$$P_{1,1}(z) = \frac{1}{1 - z - z^2 - \cdots - z^m},$$

a variant of the OGF of words without $m$-runs of the letter $u$, which also corresponds to the
enumeration of compositions with summands $\le m$. The case of the probabilistic transition
matrix $\widehat{G}$ is left as an exercise to the reader. ....... END OF EXAMPLE V.14.


EXAMPLE V.15. De Bruijn graphs. Two thieves want to break into a house whose entrance is
protected by a digital lock with an unknown four-digit code. As soon as the four digits of the code
are typed consecutively, the gate opens. The first thief proposes to try in order all the four-digit
sequences, resulting in as many as 40,000 key strokes in the worst case. The second thief, who
is a mathematician, says he can try all four-digit combinations with only 10,003 key strokes.
What is the mathematician’s trade secret?

FIGURE V.18. The de Bruijn graph: (left) $\ell = 3$; (right) $\ell = 7$. [Drawings not reproduced.]

Clearly certain optimizations are possible: for instance, for an alphabet of cardinality 2
and codes of 2 letters, the sequence 00110 is better than the naive one, 00011011, which
is redundant; a few more attempts will lead to an optimal solution for 3-digit codes that has
length 10 (rather than 24), for instance,

0001110100.


The general question is then: How far can one go and how to construct such sequences? 

Fix an alphabet of cardinality $m$. A sequence that contains as factors (contiguous blocks)
all the $k$-letter words is called a de Bruijn sequence. Clearly, its length must be at least
$\delta(m,k) = m^k + k - 1$, as it must have at least $m^k$ positions at distance at least $k - 1$ from the
end. A sequence of smallest possible length $\delta(m,k)$ is called a minimal de Bruijn sequence.
Such sequences were discovered by N. G. de Bruijn [109] in 1946, in response to a question
coming from electrical engineering, where all possible reactions of a device presented as a black
box must be tested at minimal cost. We shall expose here the case of a binary alphabet, $m = 2$,
the generalization to $m > 2$ being obvious.

Let $\ell = k - 1$ and consider the automaton $\mathcal{B}_\ell$ that memorizes the last block of length $\ell$ read
when scanning the input text from left to right. A state is thus assimilated to a string of length $\ell$
and the total number of states is $2^\ell$. The transitions are easily calculated: let $q \in \{0,1\}^\ell$ be
a state and let $\sigma(w)$ be the function that shifts all letters of a word $w$ one position to the left,
dropping the first letter of $w$ in the process (thus $\sigma$ maps $\{0,1\}^\ell$ to $\{0,1\}^{\ell-1}$); the transitions
are

$$q \xrightarrow{\;0\;} \sigma(q)0, \qquad q \xrightarrow{\;1\;} \sigma(q)1.$$

If one further interprets a state $q$ as the integer in the interval $[0 \,..\, 2^\ell - 1]$ that it represents, then
the transition matrix assumes a remarkably simple form:

$$T_{i,j} = [\![\, (j \equiv 2i \bmod 2^\ell) \ \text{or} \ (j \equiv 2i + 1 \bmod 2^\ell) \,]\!].$$

See Figure 18 for a rendering borrowed from [215].

Combinatorially, the de Bruijn graph is such that each node has indegree 2 and outdegree 2.
By a well known theorem going back to Euler: A necessary and sufficient condition for an
undirected connected graph to have an Eulerian circuit (that is, a closed path that traverses
each edge exactly once) is that every node has even degree. For a strongly connected digraph,
the condition is that each node has an outdegree equal to its indegree. This last condition is
obviously satisfied here. Take an Eulerian circuit starting and ending at node $0^\ell$; its length is
$2^{\ell+1} = 2^k$. Then, clearly, the sequence of edge labels encountered when prefixed with the word
$0^{k-1} = 0^\ell$ constitutes a minimal de Bruijn sequence. In general, the argument gives a de Bruijn
sequence with minimal length $m^k + k - 1$. Et voilà! The trade secret of the thief-mathematician
is exposed.
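
The construction just described can be sketched in code. The block below is an illustration, not the original authors' procedure: it builds an Eulerian circuit of the de Bruijn graph with a stack-based variant of Hierholzer's algorithm (an assumed choice of method), records the edge labels, and prefixes them with $0^{k-1}$. The function name is hypothetical, and digits are restricted to single characters ($m \le 10$).

```python
from itertools import product


def de_bruijn(m, k):
    """Minimal de Bruijn sequence over the digits 0..m-1 (m <= 10) that
    contains every k-letter word as a factor; its length is m**k + k - 1."""
    alphabet = [str(d) for d in range(m)]
    ell = k - 1
    start = "0" * ell
    # Outgoing edges not yet used, per state (a state is a word of length ell).
    unused = {"".join(q): list(reversed(alphabet))
              for q in product(alphabet, repeat=ell)}
    stack = [(start, "")]      # (state, label of the edge used to enter it)
    labels = []
    while stack:
        state, label = stack[-1]
        if unused[state]:
            a = unused[state].pop()
            stack.append(((state + a)[1:], a))   # shift left, append letter a
        else:
            labels.append(label)                 # state exhausted: emit edge
            stack.pop()
    labels.reverse()                             # circuit in traversal order
    return start + "".join(labels)


def factors(word, k):
    """Set of all contiguous blocks of length k in word."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}
```

For instance, `de_bruijn(2, 2)` returns `00110`, the sequence quoted above, and `de_bruijn(10, 4)` has length 10,003, which is the thief-mathematician's count.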

Back to enumeration. The de Bruijn matrix is irreducible since a path labelled by sufficiently
many zeros always leads any state to the state $0^\ell$, while a path ending with the letters
of $w \in \{0,1\}^\ell$ leads to state $w$. The matrix is aperiodic since it has a loop on states $0^\ell$ and $1^\ell$.
Thus, by Perron-Frobenius properties, it has a unique dominant eigenvalue, and it is not hard to
check that its value is $\lambda_1 = 2$, corresponding to the right eigenvector $(1, 1, \ldots, 1)^t$. If one fixes
a pattern $w \in \{0,1\}^\ell$, Theorem V.8 yields back the known fact that a random word contains
on average $\sim n/2^\ell$ occurrences of pattern $w$, while Note 39 further implies that the distribution
of the number of occurrences is concentrated around the mean, as the variance is $O(n)$. The
de Bruijn graph may be used to quantify many properties of occurrences of patterns in random
words: see for instance [35, 192, 215]. ....... END OF EXAMPLE V.15.


EXAMPLE V.16. Words with excluded patterns. Fix a finite set of patterns $\Omega = \{w_1, \ldots, w_r\}$,
where each $w_j$ is a word of $\mathcal{A}^*$. The language $\mathcal{E} \equiv \mathcal{E}^{\Omega}$ of words that contain no factor in $\Omega$ is
described by the extended regular expression

$$\mathcal{E} = \mathcal{A}^* \setminus \bigcup_{j=1}^{r} \left(\mathcal{A}^* w_j \mathcal{A}^*\right),$$

which constitutes a concise but highly ambiguous description. By closure properties of regular
languages, $\mathcal{E}$ is itself regular and there must exist a deterministic automaton that recognizes it.

An automaton recognizing $\mathcal{E}$ can be constructed starting from the de Bruijn automaton of
index $k = -1 + \max|w_j|$ and deleting all the vertices and edges that correspond to a word
of $\Omega$. Precisely, vertex $q$ is deleted whenever $q$ contains a factor in $\Omega$; the transition (edge) from
$q$ associated with letter $\alpha$ gets deleted whenever the word $q\alpha$ contains a factor in $\Omega$. The pruned
de Bruijn automaton, call it $\mathcal{B}_k^{\Omega}$, accepts all words of $0^k\mathcal{E}$, when it is equipped with the initial
state $0^k$ and all states are final. Thus, the OGF $E(z)$ is in all cases a rational function.

The matrix of $\mathcal{B}_k^{\Omega}$ is the matrix of $\mathcal{B}_k$ with some nonzero entries replaced by 0. Assume that
$\mathcal{B}_k^{\Omega}$ is irreducible. This assumption only eliminates a few pathological cases (e.g., $\Omega = \{01\}$
on the alphabet $\{0, 1\}$). Then, the matrix of $\mathcal{B}_k^{\Omega}$ admits a simple Perron-Frobenius eigenvalue
$\lambda_1$. By domination properties ($\Omega \ne \emptyset$), we must have $\lambda_1 < m$, where $m$ is the cardinality
of the alphabet. Aperiodicity is automatically granted. We then get by a purely qualitative
argument: The number of words of length $n$ excluding patterns from the finite set $\Omega$ is, under
the assumption of irreducibility, asymptotic to $c\lambda_1^n$, for some $c > 0$ and $\lambda_1 < |\mathcal{A}|$. This gives
us in a simple manner a strong version of what has been earlier nicknamed “Borges’s Theorem”
(Note 32, p. 58): Almost every sufficiently long text contains all patterns of some predetermined
length $\ell$.

The construction of a pruned automaton is clearly a generalization of the case of words
obeying local constraints in Example 11 above. ....... END OF EXAMPLE V.16.
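
The pruned automaton lends itself to a direct counting procedure. The sketch below is an illustration under stated assumptions, not the original construction verbatim: instead of starting from the state $0^k$, it starts from the empty word and lets the state be the last (at most $k$) letters read, which counts the words of $\mathcal{E}$ themselves. The function name and the sample pattern set are hypothetical.

```python
def count_avoiding(alphabet, patterns, n):
    """Number of words of length n over `alphabet` with no factor in
    `patterns`, by dynamic programming over the states of a pruned
    de Bruijn-like automaton (state = last k letters read, with
    k = max pattern length - 1)."""
    k = max(len(p) for p in patterns) - 1
    counts = {"": 1}                       # only the empty word at length 0
    for _ in range(n):
        nxt = {}
        for state, c in counts.items():
            for a in alphabet:
                w = state + a
                # A new occurrence must end at the last letter read.
                if any(w.endswith(p) for p in patterns):
                    continue               # this edge is deleted
                key = w[-k:] if k > 0 else ""
                nxt[key] = nxt.get(key, 0) + c
        counts = nxt
    return sum(counts.values())


# Assumed example: binary words with no two consecutive 1s.
fib_like = [count_avoiding("01", ["11"], n) for n in range(7)]
growth = count_avoiding("01", ["11"], 30) / count_avoiding("01", ["11"], 29)
```

For this assumed pattern set the counts are Fibonacci numbers and the growth ratio approaches the golden ratio, which is indeed smaller than the alphabet cardinality 2.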


> V.40. Walks on undirected graphs. Consider an undirected graph $\Gamma$, where one moves by following
at each step an edge of the graph chosen uniformly at random from the current position.
Then, the transition matrix $P = (p_{i,j})$ of the associated Markov chain is: $p_{i,j} = 1/\deg(i)$ if
$(i, j)$ is an edge, where $\deg(i)$ is the degree of vertex $i$. The stationary distribution is given by
$\pi_i = \deg(i)/(2|E|)$, where $|E|$ is the number of edges of $\Gamma$. In particular, if the graph is
regular, the stationary distribution is uniform. (See Aldous and Fill’s forthcoming book [8] for
(much) more.)
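
The stationarity claim of Note V.40 can be checked exactly on a small graph. The sketch below uses rational arithmetic; the example graph (a triangle with one pendant vertex) and all names are assumptions made for illustration.

```python
from fractions import Fraction


def walk_matrix(adjacency):
    """Transition matrix p[i][j] = 1/deg(i) if {i, j} is an edge, else 0."""
    n = len(adjacency)
    return [
        [Fraction(1, len(adjacency[i])) if j in adjacency[i] else Fraction(0)
         for j in range(n)]
        for i in range(n)
    ]


def degree_distribution(adjacency):
    """Candidate stationary law pi_i = deg(i) / (2 |E|)."""
    twice_edges = sum(len(nbrs) for nbrs in adjacency)  # each edge counted twice
    return [Fraction(len(nbrs), twice_edges) for nbrs in adjacency]


# Hypothetical example: triangle 0-1-2 with a pendant vertex 3 attached to 2.
ADJ = [{1, 2}, {0, 2}, {0, 1, 3}, {2}]
P = walk_matrix(ADJ)
PI = degree_distribution(ADJ)
# One step of the chain applied to PI (row vector times matrix).
PI_NEXT = [sum(PI[i] * P[i][j] for i in range(len(ADJ))) for j in range(len(ADJ))]
```

Here `PI_NEXT` equals `PI`, which is the stationarity property $\pi P = \pi$.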

> V.41. Words with excluded patterns and digital trees. Let $S$ be a finite set of words. An
automaton recognizing $S$, considered as a finite language, can be constructed as a tree. The tree
obtained is akin to the classical digital tree or trie that serves as a data structure for maintaining
dictionaries [307].

A modification of the construction yields an automaton of size linear in the total number of
characters that appear in words of $S$. [Hint. The construction can be based on the Aho–Corasick
automaton [4].] <


V.6. Transfer matrix models 


There exists a cluster of applications of rational functions to problems that are nat- 
urally described as paths in digraphs, but with edges that may be of different sizes. In 
physics, such models lie at the heart of what is known as the “transfer matrix method”. 
Technically, the theory is a simple extension of the standard case of paths in graphs 
developed in the previous section to which it reduces when all edges have the same 
length. Its main interest lies in its expressiveness as regards a number of combinato- 
rial problems, including trees of bounded width, models of self-avoiding walks, and 
certain constrained permutation problems. 


V.6.1. Combinatorial aspects. The transfer matrix method constitutes a variant 
of the modelling by deterministic automata and by paths in standard graphs. The 
general framework is summarized in Figure 19. 

Usually, when setting up such a system, one has to invent a finite collection of
properties (“states”) describing the $\mathcal{C}_j$, which are of the same nature as the original
class $\mathcal{C}$. The combinatorial system (102) can be visualized as a graph with the objects
of the $\Omega_{j,k}$ classes attached to edges (“transitions between states”) that are generally
of different sizes.


Definition V.7. Given a directed multigraph $\Gamma$ with vertex set $V$ and edge set $E$, a
size function on $\Gamma$ is any function $\sigma : E \to \mathbb{Z}_{\geq 1}$. A sized graph is a pair $(\Gamma, \sigma)$, where
$\sigma$ is a size function.


Paths are defined in the same way as in Section V.5. The length of a path is, as
usual, the number of edges it comprises; the size of a path is defined to be the sum
of the sizes of its edges. As in the basic case treated in the previous section, we
also allow edges to carry positive weights (multiplicities, probability coefficients), the
weight of a path being the product of the weights of its edges.

Definition V.8. A matrix $T(z)$ is a transfer matrix if each of its entries is a polynomial
in $z$ with nonnegative coefficients. A transfer matrix $T(z)$ is said to be proper if $T(0)$
is nilpotent, that is, $T(0)^r = 0$ for some $r \geq 1$.

Examples of transfer matrices are 


[display: two example transfer matrices, not recoverable from this copy]


and both are proper. For the graphs and automata considered in Section V.5, all edges 
were taken to be of unit size. In that case, the associated (weighted) adjacency matrices 
are invariably of the form T(z) = zS, with S a scalar matrix having nonnegative 
entries, and thus are very particular cases of proper transfer matrices. 


Transfer Matrix Method. Let $\mathcal{C}$ be a combinatorial class to be enumerated.

— Determine a collection $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_m$ of classes, with $\mathcal{C}_1 = \mathcal{C}$, such that the following
system of equations holds:

(102)    $$\mathcal{C}_j = \sum_{k \in \{1,2,\ldots,m\}} \Omega_{j,k}\, \mathcal{C}_k + \mathcal{I}_j,$$

where each $\Omega_{j,k}$ and each $\mathcal{I}_j$ is a finite class.
— The OGF $C(z) \equiv C_1(z)$ is then given by the solution of the linear system

$$C_j(z) = \sum_{k} \Omega_{j,k}(z)\, C_k(z) + I_j(z), \qquad j = 1, \ldots, m,$$

where $\Omega_{j,k}(z)$ and $I_j(z)$ are the generating polynomials of $\Omega_{j,k}$ and $\mathcal{I}_j$, respectively. Accordingly,
$C(z)$ is a linear combination of entries of the quasi-inverse matrix $(I - \Omega(z))^{-1}$.


FIGURE V.19. A summary of the basic Transfer Matrix Method.


Given a sized graph $\Gamma$ equipped with weight function $w : E \to \mathbb{R}_{>0}$ (with
$w(e) = 1$ in the pure enumerative case), we can associate to it a transfer matrix $T(z)$
as follows:

(101)    $$T_{a,b}(z) = \sum_{e \in \mathrm{Edge}(a,b)} w(e)\, z^{|e|}.$$

There, $\mathrm{Edge}(a, b)$ represents the set of all edges connecting $a$ to $b$; $w(e)$ and $|e| =
\sigma(e)$ represent respectively the weight and the size of edge $e$. The matrix $T(z)$ whose
$(a, b)$-entry is the polynomial $T_{a,b}(z)$, as given in (101), is called the transfer matrix
of the (weighted, sized) graph. Clearly, the transfer matrix of a sized graph is always
proper. Since $T(z)^n$ describes all paths of length $n$ in the graph, with $z$ marking size, the proof
techniques of Proposition V.6 (p. 321) immediately provide:

Proposition V.9. Given a sized graph with associated transfer matrix $T(z)$, the OGF
$F^{\langle i,j \rangle}(z)$ of the set of paths from $i$ to $j$, where $z$ marks size, is the entry $i,j$ of the
matrix $(I - T(z))^{-1}$:

$$F^{\langle i,j \rangle}(z) = \left( (I - T(z))^{-1} \right)_{i,j}.$$
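
To make Proposition V.9 concrete, the coefficients of $(I - T(z))^{-1}$ can be computed from the identity $F = I + T(z)F$, which turns into a recurrence on sizes because every edge has size at least 1. The sketch below is an illustration under that assumption; the edge-list encoding `(source, target, size, weight)` is a hypothetical convention, not one from the text.

```python
def path_counts(n_vertices, edges, max_size):
    """Return F with F[n][i][j] = total weight of paths from i to j of size n.

    edges is a list of (source, target, size, weight) with size >= 1.
    Uses F = I + T(z) F, that is,
    [z^n] F[a][j] = [n == 0 and a == j] + sum over edges a -> b of
                    weight * [z^(n - size)] F[b][j].
    """
    F = [[[0] * n_vertices for _ in range(n_vertices)]
         for _ in range(max_size + 1)]
    for i in range(n_vertices):
        F[0][i][i] = 1  # the empty path at each vertex
    for n in range(1, max_size + 1):
        for (a, b, size, weight) in edges:
            if size <= n:
                for j in range(n_vertices):
                    F[n][a][j] += weight * F[n - size][b][j]
    return F


# One vertex carrying a loop of size 1 and a loop of size 2: T(z) = (z + z^2).
FIB = path_counts(1, [(0, 0, 1, 1), (0, 0, 2, 1)], 7)
# Two vertices: an edge 0 -> 1 of size 2 and an edge 1 -> 0 of size 1.
CYCLE = path_counts(2, [(0, 1, 2, 1), (1, 0, 1, 1)], 6)
```

In the first example the entry is $1/(1 - z - z^2)$, so the counts are Fibonacci numbers; in the second, closed paths at vertex 0 exist only for sizes that are multiples of 3.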


V.6.2. Analytic aspects. In order to apply the general results from the previous 
section to transfer matrices, we must first take note of an easy reduction of transfer 
matrices to the standard case of paths in graphs where all edges have size 1. 

Given a sized graph $\Gamma$, one can build as follows a standard graph $G$ where all
edges have unit size. The set of vertices of $G$ is the set of vertices of $\Gamma$ augmented
by additional vertices called relay nodes. For each edge $e$ of size $\sigma(e) = m$ in $\Gamma$, going from $a$ to $b$,
introduce $m - 1$ additional relay nodes and connect these in $G$ by a simple path from
$a$ to $b$, with edges all of size 1. Here is for instance the transcription of an edge of
size 4 in $\Gamma$ by means of three relay nodes in $G$:

    a --> o --> o --> o --> b

Clearly, the vertices of $\Gamma$ are a subset of the vertices of $G$ and all paths of $\Gamma$ correspond
to paths of $G$. Let $T$ be the (scalar) adjacency matrix of $G$. Then, the quasi-inverse
$(I - zT)^{-1}$ describes all the paths in $\Gamma$, with size taken into account, in the sense that
the entry of index $(i, j)$ in this quasi-inverse is the OGF of paths from node numbered $i$
to node numbered $j$ in the sized graph $\Gamma$.
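
The relay-node reduction is easy to mechanize. The following sketch (names and the `(source, target, size)` encoding are assumptions for illustration) expands a sized graph into a unit-edge graph and counts walks by length, so that walks between original vertices can be compared with paths counted by size.

```python
def expand_with_relay_nodes(n_vertices, sized_edges):
    """Replace each edge (a, b, size) by a chain of unit edges through
    size - 1 fresh relay nodes.

    Returns (number of vertices of the unit graph, list of unit edges).
    """
    unit_edges = []
    next_free = n_vertices  # relay nodes are numbered after the original ones
    for a, b, size in sized_edges:
        previous = a
        for _ in range(size - 1):
            unit_edges.append((previous, next_free))
            previous = next_free
            next_free += 1
        unit_edges.append((previous, b))
    return next_free, unit_edges


def walks_by_length(n_vertices, unit_edges, start, max_length):
    """walks[n][v] = number of walks of exactly n unit steps from start to v."""
    current = [0] * n_vertices
    current[start] = 1
    walks = [current]
    for _ in range(max_length):
        nxt = [0] * n_vertices
        for a, b in unit_edges:
            nxt[b] += current[a]
        current = nxt
        walks.append(current)
    return walks
```

For a single vertex with loops of sizes 1 and 2, the expansion has one relay node, and closed walks of length $n$ at the original vertex are counted by Fibonacci numbers, as expected from the sized graph.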

This construction permits us to apply the main results of Section V.5 to transfer
matrices and sized graphs. Let us say that the sized graph $\Gamma$ and its transfer matrix
$T(z)$ are irreducible (respectively aperiodic) if $G$ and $T$ are irreducible (respectively
aperiodic). We can then transcribe immediately Theorems V.7 and V.8 as follows.


Corollary V.1. (i) Consider a sized graph $\Gamma$ that is irreducible and aperiodic. Then,
there exist a computable constant $\lambda_1$ and numbers $\varphi_{i,j}$ such that the OGF of paths
from $i$ to $j$ in $\Gamma$ satisfies

(103)    $$[z^n] F^{\langle i,j \rangle}(z) = \varphi_{i,j}\, \lambda_1^n + O(\Lambda^n), \qquad 0 \leq \Lambda < \lambda_1.$$

(ii) In a random path from $a$ to $b$ of large size $n$, the number of occurrences of a
designated edge $(s,t)$ is asymptotically

(104)    $$\omega_{s,t}\, n + O(1),$$

for a computable constant $\omega_{s,t}$.


Thus, on general grounds, the behaviour of paths is predictable. The notes be- 
low explore some further properties that make it possible to operate directly with the 
transfer matrix and the sized graph, without necessitating the explicit construction of 
T and G. 
> V.42. Irreducibility for sized graphs. The sized graph $\Gamma$ is irreducible (in the sense above) if
and only if the graph obtained from $\Gamma$ by taking all edges to be of size 1 is strongly connected. The
transfer matrix $T(z)$ of $\Gamma$ is irreducible (in the sense above) if and only if $T(1)$ is irreducible in
the usual sense of scalar transfer matrices. <


> V.43. Aperiodicity for sized graphs. A polynomial $p(z) = \sum_j c_j z^{e_j}$, with every $c_j \neq 0$,
is said to be primitive if the quantity $\delta = \gcd(\{e_j\})$ is equal to 1; it is imprimitive otherwise.
Equivalently, $p(z)$ is imprimitive iff $p(z) = q(z^{\delta})$ for some bona fide polynomial $q$ and some
$\delta > 1$. An irreducible sized graph is aperiodic (in the sense above) if and only if at least one
diagonal entry of some power $T(z)^e$ is a primitive polynomial. Equivalently: there exist two
circuits of the same length, whose sizes $s_1$, $s_2$ satisfy $\gcd(s_1, s_2) = 1$. <


> V.44. Direct determination of the asymptotic growth constant. Let $\Gamma$ be a sized graph assumed
to be irreducible and aperiodic. Then, one has $\lambda_1 = 1/\rho$, where $\rho$ is the smallest
positive root of $\det(I - T(z)) = 0$, with $T(z)$ the transfer matrix of $\Gamma$. <
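
Note V.44 suggests a direct numerical recipe. The sketch below locates the smallest positive root of $\det(I - T(z))$ by a coarse scan followed by bisection; it assumes the determinant changes sign at that root (true for a simple root), and the function names, scan range, and step count are illustrative choices.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def smallest_positive_root(transfer, upper=1.0, steps=10000):
    """Smallest z in (0, upper] with det(I - T(z)) = 0.

    transfer(z) must return T(z) as a list of rows of floats.
    Assumes a sign change of the determinant at the root.
    """
    def f(z):
        t = transfer(z)
        n = len(t)
        return determinant(
            [[(1.0 if i == j else 0.0) - t[i][j] for j in range(n)]
             for i in range(n)]
        )

    lo = 0.0  # f(0) = 1 > 0 for a proper transfer matrix of a sized graph
    for s in range(1, steps + 1):
        hi = upper * s / steps
        if f(hi) <= 0.0:
            break
        lo = hi
    else:
        raise ValueError("no sign change found below upper")
    for _ in range(100):  # bisection on [lo, hi]
        mid = (lo + hi) / 2
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For instance, the $2 \times 2$ matrix with rows $(z, z^2)$ and $(2z, 3z^2)$, which is the width-2 case of the tree model of Example V.17, gives $1/\rho \approx 2.1701$, and the one-state matrix $(z + z^2)$ gives the golden ratio.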


V.6.3. Applications. The quantitative properties summarized by (103) and (104) 
apply with full strength to classes that are amenable to the transfer matrix method. We 
shall first illustrate the situation by the width of trees following an early article by 
Odlyzko and Wilf [379], then continue with an example that draws its inspiration 
from the insightful exposition of domino tilings and generating functions in the book 
of Graham, Knuth, and Patashnik [248], and conclude with an exactly solvable poly- 
omino model. 


EXAMPLE V.17. Width of trees. The width of a tree is defined as the maximal number of nodes
that can appear on any layer at a fixed distance from the root. If a tree is drawn in the plane,
then width and height can be seen as the horizontal and vertical dimensions of the bounding
rectangle. Also, width is an indicator of the complexity of traversing the tree in breadth-first
search (by a queue), while height is associated to depth-first search (by a stack).


$$T(z) = \begin{pmatrix} z & z^2 & z^3 \\ 2z & 3z^2 & 4z^3 \\ 3z & 6z^2 & 10z^3 \end{pmatrix}$$

FIGURE V.20. The sized graph corresponding to general plane trees of width at most 3
and its transfer matrix. (For readability, the transitions from a node to itself are omitted.)

Transfer matrices are ideally suited to the problem of analysing the number of trees of fixed
width. Consider a simple variety of trees $\mathcal{Y}$ corresponding to the equation $Y(z) = z\phi(Y(z))$,
where the “generator” $\phi$ describes the formation of trees. Let $\mathcal{C} := \mathcal{Y}^{[w]}$ be the subclass of
trees of width at most $w$. Such trees are easily built layer by layer. Indeed, with reference
to our general description of the transfer matrix method at the beginning of the section, let
us introduce a collection of classes $\mathcal{C}_k$, where each $\mathcal{C}_k$ ($k = 1, \ldots, w$) comprises all trees of
width $\leq w$ having exactly $k$ nodes at the deepest level. We then have $\mathcal{C} = \sum_{k=1}^{w} \mathcal{C}_k$ (this is a
trivial variant of the case considered in our general description). Thus the states of the transfer
matrix model, equivalently the nodes of the sized graph, correspond to the number of nodes on
the deepest layer of the tree. The transition between configurations $\mathcal{C}_j$ corresponding to state
$j$ and configurations $\mathcal{C}_k$ corresponding to state $k$ is effected by grafting in all possible ways a
forest of $j$ trees, of total height equal to 1, having $k$ leaves. See Figure 20 for the case of width
$w = 3$.

The number of $j$-forests of depth 1 having $k$ leaves is the quantity

$$t_{j,k} = [u^k]\, \phi(u)^j.$$

Let $T$ be the $w \times w$ matrix with entry $T_{j,k} = z^k t_{j,k}$. Then, clearly, the quantity $z^i (T^h)_{i,j}$
(with $1 \leq i, j \leq w$) is the generating function, with $z$ marking size, of $i$-forests of height $h$
and width at most $w$, having $j$ nodes on level $h$. Thus, the GF of $\mathcal{Y}$-trees having width at most $w$ is

$$Y^{[w]}(z) = (z, 0, 0, \ldots)\,(I - T)^{-1}\,(1, 1, 1, \ldots)^{t}.$$
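
The layer-by-layer construction translates directly into a dynamic program over the states "number of nodes on the deepest level". The sketch below treats only general plane (Catalan) trees, for which $\phi(u) = 1/(1-u)$ and hence $t_{j,k} = \binom{j+k-1}{j-1}$; the function name is an illustrative assumption.

```python
from math import comb


def trees_of_bounded_width(max_size, width):
    """counts[n] = number of general plane (Catalan) trees with n nodes
    and width at most `width`, built layer by layer."""
    # ways[n][k] = number of such trees of size n with exactly k nodes
    #              on the deepest level.
    ways = [[0] * (width + 1) for _ in range(max_size + 1)]
    if max_size >= 1:
        ways[1][1] = 1  # the tree reduced to its root
    for n in range(1, max_size + 1):
        for j in range(1, width + 1):
            if ways[n][j] == 0:
                continue
            for k in range(1, width + 1):
                if n + k > max_size:
                    break
                # t_{j,k} = [u^k] (1 - u)^(-j) = C(j + k - 1, j - 1):
                # ways to hang k children in total below j ordered nodes.
                ways[n + k][k] += ways[n][j] * comb(j + k - 1, j - 1)
    return [sum(row) for row in ways]
```

When the width bound is at least the size, the bound is inactive and the counts reduce to Catalan numbers; with width 1 only the path-shaped tree survives for each size.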


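For instance, in the case of general Catalan trees, the matrix $T$ has the shape

$$T = \begin{pmatrix}
z\binom{1}{0} & z^2\binom{2}{0} & z^3\binom{3}{0} & z^4\binom{4}{0} \\
z\binom{2}{1} & z^2\binom{3}{1} & z^3\binom{4}{1} & z^4\binom{5}{1} \\
z\binom{3}{2} & z^2\binom{4}{2} & z^3\binom{5}{2} & z^4\binom{6}{2} \\
z\binom{4}{3} & z^2\binom{5}{3} & z^3\binom{6}{3} & z^4\binom{7}{3}
\end{pmatrix}$$

for width 4. The analysis of dominant poles provides asymptotic formulae for $[z^n]Y^{[w]}(z)$:

    w = 2              w = 3              w = 4              w = 5              w = 6
    0.0085 · 2.1701^n  0.0026 · 2.8050^n  0.0012 · 3.1638^n  0.0006 · 3.3829^n  0.0004 · 3.5259^n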


Irreducibility is granted since all entries in the transfer matrix are nonzero. Aperiodicity derives
from aperiodicity of the generator $\phi$, as verified by a simple argument (e.g., using Note 43).

Proposition V.10. The number of trees of width at most $w$ in a simple family of trees satisfies
an asymptotic estimate of the form

$$Y^{[w]}_n = c_w\, \rho_w^{n} + O(n),$$

for some computable positive constants $c_w$, $\rho_w$.


In addition, the exact distribution of width in trees of size $n$ becomes computable in polynomial
time (though with a somewhat high exponent).

The character of these generating functions has not been investigated in detail since the 
original work [379], so that, at the moment, analysis stops there. Fortunately, probability theory 
can take over. Chassaing and Marckert [82] have shown, for Cayley trees, that the width satisfies 


$$\mathbb{E}_n(W) = \sqrt{\frac{\pi}{2}}\,\sqrt{n} + O\!\left(n^{1/4}\sqrt{\log n}\right), \qquad \mathbb{P}_n\!\left(\sqrt{2}\,W \leq x\sqrt{n}\right) \to 1 - \Theta(x),$$


where $\Theta(x)$ is the Theta function defined in (57), p. 305. This answers very precisely an open
question of Odlyzko and Wilf [379]. The distributional results of [82] extend to trees in any
simple variety (under mild and natural analytic assumptions on the generator $\phi$): see the paper
by Chassaing, Marckert, and Yor [83], which builds upon earlier results of Drmota and Gittenberger
[136]. In essence, the conclusion of these works is that the breadth-first search traversal
of a large tree in a simple variety gives rise to a queue whose size fluctuates asymptotically
like a Brownian excursion, and is thus, in a strong sense, of a complexity comparable to depth-first
search: trees taken uniformly don’t have much of a preference as to the way they may be
traversed. END OF EXAMPLE V.17.


> V.45. A question on width polynomials. It is unknown whether the following assertion is
true. The smallest positive root $\rho_k$ of the denominator of $Y^{[k]}(z)$ satisfies


for some $c > 0$. If such an estimate holds together with suitable companion bounds, it would
yield a purely analytic proof of the fact that expected width of $n$-trees is $O(\sqrt{n})$, as well as
detailed probability estimates. (The classical theory of Fredholm equations may be useful in
this context.) <


EXAMPLE V.18. Monomer-dimer tilings of a rectangle. Suppose one is given pieces that may 
be one of the three forms: monomers (m) that are 1 x 1 squares, and dimers that are dominoes, 
either vertically (v) oriented 1 x 2, or horizontally (h) oriented 2 x 1. In how many ways can 
an n x 3 rectangle be covered completely and without overlap (‘tiled’) by such pieces? 

The pieces are thus of the following types,

[Figure: the three piece types $m$, $h$, and $v$.]

and here is a particular tiling of a $5 \times 3$ rectangle:

[Figure: a tiling of a $5 \times 3$ rectangle.]

In order to approach this counting problem, one defines a class $\mathcal{C}$ of combinatorial objects
called configurations. A configuration relative to an $n \times k$ rectangle is a partial tiling, such that
all the first $n - 1$ columns are entirely covered by dominoes while between zero and three unit
cells of the last column are covered. Here are, for instance, configurations corresponding to the
example above.

[Figure: configurations corresponding to the example tiling.]


These diagrams suggest the way configurations can be built by successive addition of
dominoes. Starting with the empty rectangle $0 \times 3$, one adds at each stage a collection of
at most three dominoes in such a way that there is no overlap. This creates a configuration
where, as in the example above, the dominoes may not be aligned in a flush-right manner.
Continue to add successively dominoes whose left border is at abscissa 1, 2, 3, etc., in a way
that creates no internal “holes”.

Depending on the state of filling of their last column, configurations can thus be classified
into 8 classes that we may index in binary as $\mathcal{C}_{000}, \ldots, \mathcal{C}_{111}$. For instance, $\mathcal{C}_{001}$ represents
configurations such that the first two cells (from top to bottom, by convention) are free, while the
third one is occupied. Then, a set of rules describes the new type of configuration obtained,
when the sweep line is moved one position to the right and dominoes are added. For instance,
we have

    $\mathcal{C}_{010}$  [diagram of the added dominoes]  $\Longrightarrow$  $\mathcal{C}_{101}$.
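In this way, one can set up a grammar (resembling a deterministic finite automaton) that
expresses all the possible constructions of longer rectangles from shorter ones according to the
last layer added. The grammar comprises productions like

$$\begin{aligned}
\mathcal{C}_{000} = {} & \epsilon + mmm\,\mathcal{C}_{000} + mv\,\mathcal{C}_{000} + vm\,\mathcal{C}_{000} \\
& + {\cdot}mm\,\mathcal{C}_{100} + m{\cdot}m\,\mathcal{C}_{010} + mm{\cdot}\,\mathcal{C}_{001} + v{\cdot}\,\mathcal{C}_{001} + {\cdot}v\,\mathcal{C}_{100} \\
& + m{\cdot}{\cdot}\,\mathcal{C}_{011} + {\cdot}m{\cdot}\,\mathcal{C}_{101} + {\cdot}{\cdot}m\,\mathcal{C}_{110} + {\cdot}{\cdot}{\cdot}\,\mathcal{C}_{111}.
\end{aligned}$$

In this grammar, a “letter” like $mv$ represents the addition of dominoes, in top to bottom order,
of types $m$, $v$ respectively; the letter $m{\cdot}m$ means adding two $m$-dominoes on the top and on the
bottom, etc.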

The grammar transforms into a linear system of equations with polynomial coefficients.
The substitution

$$m \mapsto z, \qquad h \mapsto z^2, \qquad v \mapsto z^2,$$

then gives the generating functions of configurations with $z$ marking the area covered:

$$C_{000}(z) = \frac{(1 - 2z^3 - z^6)(1 + z^3 - z^6)}{(1 + z^3)(1 - 5z^3 - 9z^6 + 9z^9 + z^{12} - z^{15})}.$$

In particular, the coefficient $[z^{3n}]C_{000}(z)$ is the number of tilings of an $n \times 3$ rectangle:

$$C_{000}(z) = 1 + 3z^3 + 22z^6 + 131z^9 + 823z^{12} + 5096z^{15} + \cdots.$$

The sequence grows like $c\alpha^n$ (for $n \equiv 0 \pmod 3$) where $\alpha \doteq 1.83828$ ($\alpha$ is the cube root
of an algebraic number of degree 5). (See [81] for a computer algebra session.) On average,
for large $n$, there is a fixed proportion of monomers and the number of monomers in a
random tiling of a large rectangle is asymptotically normally distributed, as results from the
developments of Chapter IX. END OF EXAMPLE V.18.
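
The tiling counts of Example V.18 can be reproduced by a column-by-column transfer matrix. The sketch below is a variant of the construction in the text, not a transcription of its grammar: a state is taken to be the bitmask of cells of the current column already covered by horizontal dimers protruding from the previous column. This encoding and the function names are assumptions made for illustration.

```python
def column_transitions(height):
    """M[a][b] = number of ways to complete a column whose cells in bitmask a
    are already covered, leaving bitmask b protruding into the next column."""
    size = 1 << height
    M = [[0] * size for _ in range(size)]

    def fill(row, occupied, out, start):
        if row == height:
            M[start][out] += 1
            return
        if (occupied >> row) & 1:
            fill(row + 1, occupied, out, start)  # cell already covered
            return
        fill(row + 1, occupied, out, start)               # monomer
        fill(row + 1, occupied, out | (1 << row), start)  # horizontal dimer
        if row + 1 < height and not ((occupied >> (row + 1)) & 1):
            fill(row + 2, occupied, out, start)           # vertical dimer

    for a in range(size):
        fill(0, a, 0, a)
    return M


def count_tilings(columns, height=3):
    """Number of monomer-dimer tilings of a strip with the given number of
    columns and the given height (no piece may stick out at either end)."""
    M = column_transitions(height)
    size = 1 << height
    vec = [0] * size
    vec[0] = 1  # nothing protrudes into the first column
    for _ in range(columns):
        vec = [sum(vec[a] * M[a][b] for a in range(size)) for b in range(size)]
    return vec[0]  # nothing protrudes past the last column
```

For height 3 this yields 1, 3, 22, 131, 823, 5096 for 0 to 5 columns, matching the expansion of $C_{000}(z)$, and for a $2 \times 2$ square it yields 7.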


The tiling example is a typical illustration of the transfer matrix method as described
at the beginning of this section (p. 340). One seeks to enumerate a “special”
set of configurations $\mathcal{C}_f$. (In the example above, this is $\mathcal{C}_{000}$ representing complete
rectangle coverings.) One determines an extended set of configurations $\mathcal{C}$ (the partial
coverings, in the example) such that: (i) $\mathcal{C}$ is partitioned into finitely many classes;
(ii) there is a finite set of “actions” that operate on the classes; (iii) size is affected
in a well-defined additive way by the actions. The similarity with finite automata is
apparent: classes play the role of states and actions the role of letters.

Often, the method of transfer matrices is used to approximate a hard combinatorial
problem that is not known to decompose, the approximation being by means of a
family of models of increasing “widths”. For instance, the enumeration of the number
$T_n$ of tilings of an $n \times n$ square by monomers and dimers remains a famous unsolved
problem of statistical physics. Here, transfer matrix methods may be used to solve the
$n \times w$ version of the monomer–dimer coverings, in principle at least, for any fixed
width $w$: the result will always be a rational function, though its degree, dictated by
the dimension of the transfer matrix, will grow exponentially with $w$. (The “diagonal”
sequence of the $n \times w$ rectangular models corresponds to the square model.) It has
been at least determined by computer search that the diagonal sequence $T_n$ starts as
(this is EIS A028420):

    1, 7, 131, 10012, 2810694, 2989126727, 11945257052321, ....

From this and other numerical data, one estimates numerically that $(T_n)^{1/n^2} \to
1.94021\ldots$, but no expression for the constant is known to exist. The difficulty of
coping with the finite-width models is that their complexity (as measured, e.g., by
the number of states) blows up exponentially with $w$ (such models are best treated
by computer algebra; see [514]) and that no law allowing one to take a diagonal is visible.
However, the finite-width models have the merit of providing at least provable upper
and lower bounds on the exponential growth rate of the hard “diagonal problem”.

In contrast, for coverings by dimers only, a strong algebraic structure is available
and the number of covers of an $n \times n$ square by horizontal and vertical dimers satisfies
a beautiful formula originally discovered by Kasteleyn ($n$ even):

(105)    $$U_n = 2^{n^2/2} \prod_{j=1}^{n/2} \prod_{k=1}^{n/2} \left( \cos^2 \frac{j\pi}{n+1} + \cos^2 \frac{k\pi}{n+1} \right).$$

This sequence is EIS A004003,

    1, 2, 36, 6728, 12988816, 258584046368, 53060477521960000, ... .

FIGURE V.21. A self-avoiding polygon or SAP (left) and a self-avoiding walk or SAW
(right).


It is elementary to prove from (105) that

$$\lim_{n \to +\infty} (U_n)^{1/n^2} = \exp\left( \frac{1}{\pi} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)^2} \right) = e^{G/\pi} \doteq 1.33851\ldots,$$

where $G$ is Catalan’s constant. This means in substance that each cell has a number of
degrees of freedom equivalent to 1.33851. See Percus’ monograph [391] for proofs
of this famous result and Finch’s book [165, Sec. 5.23] for context and references.
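
Formula (105) and the limit constant are easy to evaluate numerically. The sketch below uses floating-point arithmetic with rounding, which is adequate only for small $n$; the function names and the number of series terms are illustrative assumptions.

```python
from math import cos, exp, pi


def kasteleyn(n):
    """Number of dimer coverings of an n x n square (n even), from (105).

    Evaluated in floating point and rounded; reliable for small n only.
    """
    if n % 2:
        raise ValueError("n must be even")
    half = n // 2
    value = 2.0 ** (n * n // 2)
    for j in range(1, half + 1):
        for k in range(1, half + 1):
            value *= cos(j * pi / (n + 1)) ** 2 + cos(k * pi / (n + 1)) ** 2
    return round(value)


def dimer_growth_constant(terms=200000):
    """exp(G / pi), with Catalan's constant G summed from its alternating series."""
    catalan = sum((-1) ** m / (2 * m + 1) ** 2 for m in range(terms))
    return exp(catalan / pi)
```

The first values agree with the sequence quoted above, and the constant evaluates to about 1.33851.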

> V.46. Powers of Fibonacci numbers. Consider the OGFs

$$G(z) := \frac{1}{1 - z - z^2} = \sum_{n \geq 0} F_{n+1} z^n, \qquad G^{[k]}(z) := \sum_{n \geq 0} (F_{n+1})^k z^n,$$

where $F_n$ is a Fibonacci number. The OGF of monomer–dimer placements on a $k \times n$ board
when only monomers ($m$) and horizontal dimers ($h$) are allowed is obviously $G^{[k]}(z)$. On the
other hand, it is possible to set up a transfer matrix model with state $i$ ($0 \leq i \leq k$) corresponding
to $i$ positions of the current column occupied by a previous domino. Consequently,

$$G^{[k]}(z) = \operatorname{coeff}_{k,k}\, (I - zT)^{-1}, \qquad \text{where } T_{i,j} = \binom{i}{i+j-k},$$

for $0 \leq i, j \leq k$. [The denominator of $G^{[k]}(z)$ is otherwise known exactly: see [306,
Ex. 1.2.8.30].]
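
The transfer matrix of Note V.46 can be tested numerically. The display in this copy is damaged, so the sketch below rests on an assumed reading: $T_{i,j} = \binom{i}{i+j-k}$ with the generating function read off at entry $(k,k)$. Under that reading, state $i$ behaves as the number of cells of the current column that are still free. The function names are illustrative.

```python
from math import comb


def binom(n, r):
    """Binomial coefficient, taken to be 0 outside 0 <= r <= n."""
    return comb(n, r) if 0 <= r <= n else 0


def fibonacci_power_counts(k, max_n):
    """Coefficients [z^n] of entry (k, k) of (I - zT)^(-1), n = 0..max_n,
    for the (k+1) x (k+1) matrix T[i][j] = C(i, i + j - k)."""
    T = [[binom(i, i + j - k) for j in range(k + 1)] for i in range(k + 1)]
    row = [0] * (k + 1)
    row[k] = 1  # row k of T^0
    counts = []
    for _ in range(max_n + 1):
        counts.append(row[k])
        row = [sum(row[i] * T[i][j] for i in range(k + 1)) for j in range(k + 1)]
    return counts
```

For $k = 1, \ldots, 4$ the output agrees with the $k$th powers of $1, 1, 2, 3, 5, 8, \ldots$.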


> V.47. Tours on chessboards. The OGF of Hamiltonian tours on an n x w rectangle is rational 
(one is allowed to move from any cell to any other vertically or horizontally adjacent cell). The 
same holds for king’s tours and knight’s tours. <


> V.48. Cover time of graphs. Given a fixed digraph $\Gamma$ assumed to be strongly connected, and
a designated start vertex, one travels at random, moving at each time to any neighbour of the
current vertex, making choices with equal likelihood. The expectation of the time to visit all the
vertices is a rational number that is effectively (though perhaps not efficiently!) computable.
[Hint: set up a transfer matrix, a state of which is a subset of vertices representing those vertices
that have been already visited. For an interval $[0\,..\,m]$, this can be treated by the dedicated
theory of walks on the integer interval, as in Section V.3; for the complete graph, this is equivalent
to the coupon collector problem. Most other cases are “hard” to solve analytically and one
has to resort to probabilistic approximations; see Aldous and Fill’s forthcoming book [8] for a
probabilistic approach.] <


EXAMPLE V.19. Self-avoiding walks and polygons. A long-standing open problem shared by
statistical physics, combinatorics, and probability theory alike is that of quantifying properties
of self-avoiding configurations on the square lattice (Figure 21). Here we consider objects that,
starting from the origin (the “root”), follow a path, and are solely composed of horizontal and
vertical steps of length $\pm 1$. The self-avoiding walk or SAW can wander but is subject to the
condition that it never crosses nor touches itself. The self-avoiding polygons or SAPs, whose
class is denoted by $\mathcal{P}$, are self-avoiding walks, with only an exception at the end, where the end-point
must coincide with the origin. We shall focus here on polygons. It proves convenient also
to consider unrooted polygons (also called simply-connected polyominoes), which are polygons
where the origin is discarded, so that they plainly represent the possible shapes of SAPs up to
translation. For length $2n$, the number $p_n$ of unrooted polygons satisfies $p_n = P_n/(4n)$ since
the origin ($2n$ possibilities) and the direction of traversal (2 possibilities) of the corresponding SAPs
are disregarded in that case. Here is a table, for small values of $n$, listing polyominoes and the
corresponding counting sequences $p_n$, $P_n$.


[Figure: the polyominoes of small perimeter.]

    n                     2    3    4    5     6      7      8       9       10
    p_n (EIS A002931):    1    2    7   28   124    588   2938   15268    81826
    P_n (EIS A010566):    8   24  112  560  2976  16464  94016  549648  3273040
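
The first table entries can be confirmed by exhaustive search. The following is a naive backtracking sketch (exponential time, usable only for small $n$, and unrelated to the transfer matrix algorithms discussed below); the function name is an illustrative assumption.

```python
def count_rooted_polygons(n):
    """Number P_n of rooted self-avoiding polygons of perimeter 2n (n >= 2):
    closed self-avoiding walks of 2n unit steps that start at the origin."""
    if n < 2:
        raise ValueError("n must be at least 2")

    def extend(x, y, steps_left, visited):
        # steps_left counts the steps still to be taken, closing step included.
        if steps_left == 1:
            return 1 if abs(x) + abs(y) == 1 else 0
        total = 0
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            # Prune: the origin must remain reachable in the remaining steps.
            if (nx, ny) in visited or abs(nx) + abs(ny) > steps_left - 1:
                continue
            visited.add((nx, ny))
            total += extend(nx, ny, steps_left - 1, visited)
            visited.remove((nx, ny))
        return total

    return extend(0, 0, 2 * n, {(0, 0)})
```

Dividing the result by $4n$ recovers the unrooted counts $p_n$ of the first row of the table.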


Take the (widely open) problem of determining the number $P_n$ of SAPs of perimeter $2n$.
This (intractable) problem can be approached as a limit of the (tractable) problem¹⁴ that consists
in enumerating the collection $\mathcal{P}^{[w]}$ of SAPs of width $w$, for increasing values of $w$. The
latter problem is amenable to the transfer matrix method, as first discovered by Enting in 1980;
see [152]. Indeed, take a polygon and consider a sweepline that moves from its left to its right.
Once width is fixed, there are at most $2^{2w+2}$ possibilities for the ways a vertical sweepline
may intersect the polygon’s edges at half-integer abscissae. (There are $w + 1$ edges and for
each of these, one should “remember” whether they connect with the upper or lower boundary.)
The transitions are then themselves finitely described. In this way, it becomes possible to set
up a transfer matrix for any fixed width $w$. For fixed $n$, by computing values of $P_n^{[w]}$ with
increasing $w$, one finally determines (in principle) the exact value of any $P_n$.

The program suggested above has been carried out to record values by the “Melbourne 
School” under the impulse of Tony Guttmann. For instance, Jensen [284] found in 2003 that 
the number of unrooted polygons of perimeter 100 is 


$$p_{50} = 7545649677448506970646886033356862162.$$


Attaining such record values necessitates algorithms that are much more sophisticated than the 
naive approach we have just described, as well as a number of highly ingenious programming 
optimizations. 

It is an equally open problem to estimate asymptotically the number of SAPs of perimeter
$n$. Given the exact values till perimeter 100 or more, a battery of fitting tests for asymptotic
formulae can be applied, leading to highly convincing (though still heuristic) formulae. Thanks
to several workers in this area, we can regard the final answer as “known”. From the works of
Jensen and his predecessors, it results that a reliable empirical estimate is of the form

$$p_n = B \mu^{2n} (2n)^{-\beta} (1 + o(1)),$$
$$\mu \doteq 2.63815\,85303, \qquad \beta = \frac{5}{2} \pm 3 \cdot 10^{-7}, \qquad B \doteq 0.5623013.$$

Thus, the answer is almost certainly of the form $p_n \asymp \mu^{2n} n^{-5/2}$ for unrooted polygons and
$P_n \asymp \mu^{2n} n^{-3/2}$ for rooted polygons. It is believed that the same connective constant $\mu$ dictates
the exponential growth rate of self-avoiding walks. See Finch’s book [165, Sec. 5.10] for a
perspective and numerous references.

¹⁴ In this version of the text, we limit ourselves to a succinct description and refer to the original
papers [152, 284] for details.

FIGURE V.22. Five horizontally convex polyominoes of size $n = 50$ drawn uniformly
at random.

There is also great interest in the number $p_{m,n}$ of polyominoes with perimeter $2n$ and
area $m$, with area defined as the number of square cells composing the polyomino. Studies
conducted by the Melbourne school yield numerical data that are consistent to an amazing
degree (e.g., moments till order ten and small-$n$ corrections are considered) with the following
assumption: The distribution of area in a fixed-perimeter polyomino obeys in the asymptotic
limit an “Airy area distribution”. This distribution is defined as the limit distribution of the
area under Dyck paths, a problem that was briefly discussed on p. 307 and to which we propose
to return in Chapter IX. See [284, 411] and references therein for a discussion of polyomino
area. It is finally of great interest to note that the interpretation of data was strongly guided by
what is already known for exactly solvable models of the type we are repeatedly considering in
this book. END OF EXAMPLE V.19.


EXAMPLE V.20. Horizontally convex polyominoes. Pólya [396] and Temperley [466] independently
discovered an exactly solvable polyomino model. (See also the text by van Rensburg
[482] for more.) Define as usual a polyomino as a collection of unit squares with vertices
in $\mathbb{Z}_{\geq 0} \times \mathbb{Z}_{\geq 0}$ that forms a connected set without articulation points. Such a polyomino is said
to be horizontally convex (H.C.) if its intersection with any horizontal line is either empty or
an interval. An H.C. polyomino is thus a stack of a certain number of rows of squares, where
each row has a segment of length $\geq 1$ in common with the next row up. (We imagine H.C.
polyominoes growing from bottom to top.) The enumeration of such polyominoes, following
Temperley [466, p. 66], constitutes a nice illustration of the transfer matrix method in the case
when the set of states is infinite.

Let $\mathcal{T}^{[k]}$ be the class of polyominoes with exactly $k$ square cells on their top row. The size of
a polyomino is its number of cells. We wish to enumerate the class $\mathcal{T} := \bigcup_k \mathcal{T}^{[k]}$. In order to
do so, according to the transfer matrix method, one needs to relate the $\mathcal{T}^{[k]}$ to one another. Let
$z$ be the variable marking size and let $u$ mark the size of the top row. The transition from one
$\mathcal{T}^{[k]}$ to a $\mathcal{T}^{[\ell]}$ has a multiplicity equal to $k + \ell - 1$. Thus the generating functions $t_k := T^{[k]}(z)$
satisfy the infinite system of equations

(106)    $$\begin{aligned}
t_1 &= z + z\,(t_1 + 2t_2 + 3t_3 + \cdots) \\
t_2 &= z^2 + z^2\,(2t_1 + 3t_2 + 4t_3 + \cdots) \\
t_3 &= z^3 + z^3\,(3t_1 + 4t_2 + 5t_3 + \cdots)
\end{aligned}$$

This corresponds to an infinite transfer matrix which is highly structured:

$$M(z)_{k,\ell} = (k + \ell - 1)\, z^{\ell},$$

and, as shown by Temperley [466, p. 66], the system can be solved by elementary manipulations.
In a case like this, it is well worth trying a bivariate generating function. Define

$$T(z,u) = \sum_{k \geq 1} T^{[k]}(z)\, u^k.$$

The action of “adding a slice” on the top row of a polyomino is reflected by a linear operator
$\mathcal{L}$ that transforms $u^k$ representing the top row of the polyomino before addition into a sum of
monomials $u^{\ell} z^{\ell}$ with the proper multiplicities:

$$\mathcal{L}[u^k] = k\,(uz) + (k+1)\,(uz)^2 + (k+2)\,(uz)^3 + \cdots = (k-1)\,\frac{uz}{1-uz} + \frac{uz}{(1-uz)^2}.$$

A better formula results if one expresses more generally the quantity $\mathcal{L}[f(u)]$:

(107)    $$\mathcal{L}[f(u)] = \frac{uz}{(1-uz)^2}\, f(1) + \frac{uz}{1-uz}\,\bigl(f'(1) - f(1)\bigr).$$

Treat now the BGF $T(z,u)$ as a function of $u$, keeping $z$ as a parameter, and write for readability
$\tau(u) := T(z,u)$. A horizontally convex polyomino is obtained by starting from a bottom row
that can have any number of cells and repeatedly adding a slice¹⁵. This construction is thus
reflected by the main functional equation

(108)    $$\tau(u) = \frac{zu}{1-zu} + \mathcal{L}[\tau(u)] = \frac{zu}{1-zu} + \frac{zu}{(1-zu)^2}\,\tau(1) + \frac{zu}{1-zu}\,\bigl(\tau'(1) - \tau(1)\bigr)$$

upon making use of (107). Instantiating at $u = 1$ provides the first relation

(109)    $$\tau(1) = \frac{z}{1-z} + \frac{z^2}{(1-z)^2}\,\tau(1) + \frac{z}{1-z}\,\tau'(1),$$

while differentiation of (108) with respect to $u$ followed by the specialization $u = 1$ provides
the second relation

(110)    $$\tau'(1) = \frac{z}{(1-z)^2} + \frac{2z^2}{(1-z)^3}\,\tau(1) + \frac{z}{(1-z)^2}\,\tau'(1).$$

We now have a linear system of two equations in two unknowns, resulting in an expression of
$\tau(1) \equiv T(z) \equiv T(z, 1)$, which enumerates all horizontally convex polyominoes:

(111)    $$T(z) = \frac{z(1-z)^3}{1 - 5z + 7z^2 - 4z^3}.$$

From (108) to (111), the whole calculation is barely three lines of code under a decent computer
algebra system. Note that, the original system being infinite, it is far from obvious a priori that
the generating function should be rational. (In the present context, rationality devolves from the
very regular structure of the transfer matrix.)

¹⁵ An earlier instance of the technique of “adding a slice” appears in the context of constrained
compositions, Example III.21, p. 187.

The counting sequence obtained by expansion,

$$T(z) = z + 2z^2 + 6z^3 + 19z^4 + 61z^5 + 196z^6 + 629z^7 + 2017z^8 + \cdots,$$

is EIS A001169 (“Number of board-pile polyominoes with n cells”). The asymptotic form is
also easily obtained: we find

$$T_n \sim C A^n, \qquad C \doteq 0.18091, \quad A \doteq 3.20556,$$

with $A$ a cubic irrational.
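
Both routes to the counting sequence can be compared numerically. In the sketch below, the first function iterates the infinite system (106) restricted to polyominoes of bounded size, while the second expands the rational form (111) through the recurrence induced by its denominator; the function names are illustrative assumptions.

```python
def hc_polyominoes_by_rows(max_cells):
    """T_n from system (106): a[n][k] counts H.C. polyominoes with n cells
    whose top row has k cells."""
    a = [[0] * (max_cells + 1) for _ in range(max_cells + 1)]
    for n in range(1, max_cells + 1):
        for k in range(1, n + 1):
            total = 1 if k == n else 0  # a single row of k cells
            # A row of k cells placed on a top row of ell cells: k + ell - 1 ways.
            for ell in range(1, n - k + 1):
                total += (k + ell - 1) * a[n - k][ell]
            a[n][k] = total
    return [sum(row) for row in a]


def hc_polyominoes_by_gf(max_cells):
    """T_n from (111): coefficients of z (1 - z)^3 / (1 - 5z + 7z^2 - 4z^3)."""
    numerator = [0, 1, -3, 3, -1]  # z (1 - z)^3
    t = [0] * (max_cells + 1)
    for n in range(1, max_cells + 1):
        value = numerator[n] if n < len(numerator) else 0
        value += 5 * t[n - 1]
        if n >= 2:
            value -= 7 * t[n - 2]
        if n >= 3:
            value += 4 * t[n - 3]
        t[n] = value
    return t
```

The two sequences coincide, and the ratio of consecutive terms settles quickly near $A \doteq 3.20556$.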

An alternative derivation, which is more sophisticated, is due to Klarner and is presented
in Stanley’s book [447, §4.7]. Hickerson [267] has found a direct construction, which explains
the rationality of the GF by means of a regular language encoding. (The drawings of Figure 22
have been obtained by an application of the recursive method [216] to Hickerson’s specification.)
Louchard [343] has conducted an in-depth study of probabilistic properties of several
parameters of H.C. polyominoes, using generating functions. END OF EXAMPLE V.20.


> V.49. Height of H.C. polyominoes. It is possible to introduce an extra variable $v$ to encode
height. It is found that height grows on average linearly with $n$ and that the distribution of height
is concentrated [343]. (This explains the skinny aspects of polyominoes drawn in Figure 22.) <


▷ V.50. A transfer matrix model for lattice paths. Consider the general context of weighted
lattice paths in Section V.3. Let $\alpha_j, \beta_j, \gamma_j$ be the weights of ascents, descents, and level steps
respectively, when the starting altitude is $j$. The infinite transfer matrix,

$$T = \begin{pmatrix}
\gamma_0 & \alpha_0 & 0 & 0 & 0 & \cdots \\
\beta_1 & \gamma_1 & \alpha_1 & 0 & 0 & \cdots \\
0 & \beta_2 & \gamma_2 & \alpha_2 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix},$$

which has a tridiagonal form, “generates” all lattice paths via the quasi-inverse $(I - zT)^{-1}$.
In particular, any exactly solvable weighted lattice path model is equivalent to an explicit
structured matrix inversion. ◁
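
The role of the quasi-inverse can be illustrated numerically: the coefficient of $z^n$ in the $(0,0)$ entry of $(I - zT)^{-1}$ is the $(0,0)$ entry of $T^n$, that is, the total weight of paths of length $n$ from altitude 0 back to altitude 0. The sketch below is only an illustration under stated assumptions: the matrix is truncated at a maximal altitude (harmless as long as that altitude is at least half the path length), and the names `excursion_weights`, `unit` and `zero` are hypothetical helpers.

```python
def excursion_weights(alpha, beta, gamma, length, max_altitude):
    """Total weight of paths from altitude 0 to altitude 0, for each length 0..length.

    alpha(j), beta(j), gamma(j) are the weights of an ascent, a descent and a
    level step starting at altitude j (one row of the tridiagonal matrix T).
    """
    size = max_altitude + 1
    row = [0] * size          # row[j] = weight of paths of the current length ending at j
    row[0] = 1
    result = [row[0]]
    for _ in range(length):
        nxt = [0] * size
        for j, weight in enumerate(row):
            if weight == 0:
                continue
            nxt[j] += weight * gamma(j)              # level step j -> j
            if j + 1 < size:
                nxt[j + 1] += weight * alpha(j)      # ascent j -> j+1
            if j >= 1:
                nxt[j - 1] += weight * beta(j)       # descent j -> j-1
        row = nxt
        result.append(row[0])
    return result


unit = lambda j: 1
zero = lambda j: 0
motzkin = excursion_weights(unit, unit, unit, length=7, max_altitude=7)
dyck = excursion_weights(unit, unit, zero, length=8, max_altitude=8)
print(motzkin)
print(dyck)
```

With all weights equal to 1 the sequence is that of Motzkin numbers; suppressing level steps leaves Dyck paths, counted by Catalan numbers at even lengths.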


V.6.4. Value-constrained permutations. We conclude this chapter with a dis- 
cussion of a construction that combines transfer matrix methods with an inclusion- 
exclusion argument. We treat a collection of constrained permutation problems whose 
origin lies in nineteenth-century recreational mathematics. For instance, the ménage
problem, solved and popularized by Édouard Lucas in 1891 (see [98]), has the following
quaint formulation: What is the number of possible ways one can arrange n married 
couples (‘ménages’) around a table in such a way that men and women alternate, but 
no woman sits next to her husband? 

The ménage problem is equivalent to a permutation enumeration problem. First seat
the men, by convention, at places numbered $1, 2, \ldots, n$, and the wives at positions
$\frac32, \frac52, \ldots, n + \frac12$. Let $\sigma_i$ be such that the $i$th wife is placed at $\sigma_i + \frac12$. Then, a ménage
placement imposes the conditions $\sigma_i \neq i$ and $\sigma_i \neq i + 1$ for each $i$. We consider here
a linearly arranged table (see remarks at the end for the other classical formulation
that considers a round table), so that the condition $\sigma_i \neq i + 1$ becomes vacuous when
$i = n$. Here is a ménage placement for $n = 6$ and its corresponding permutation:

[Diagram: a ménage placement for n = 6 and its permutation in two-line notation.]

Clearly, this is a generalization of the derangement problem (for which only the
weaker condition $\sigma_i \neq i$ is imposed), where the cycle decomposition of permutations
suffices to provide a direct solution (see Example II.14, p. 113).


Definition V.9. Given a permutation $\sigma = \sigma_1 \cdots \sigma_n$, any quantity $\sigma_i - i$ is called an
exceedance of $\sigma$. Given a finite set of integers $\Omega \subset \mathbb{Z}_{\geq 0}$, a permutation is said to be
$\Omega$-avoiding if none of its exceedances lies in $\Omega$.
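
For small sizes, the definition can be tested directly by exhaustive enumeration. The following brute-force sketch (the name `count_avoiding` is illustrative, and the method is plain enumeration rather than the transfer matrix technique of this section) counts $\Omega$-avoiding permutations; it is only practical for $n$ up to about 9.

```python
from itertools import permutations


def count_avoiding(n, omega):
    """Number of permutations sigma of {1..n} with no exceedance sigma_i - i in omega."""
    total = 0
    for sigma in permutations(range(1, n + 1)):
        if all(value - position not in omega
               for position, value in enumerate(sigma, start=1)):
            total += 1
    return total


derangement_counts = [count_avoiding(n, {0}) for n in range(7)]
linear_menage_counts = [count_avoiding(n, {0, 1}) for n in range(8)]
print(derangement_counts)
print(linear_menage_counts)
```

Taking $\Omega = \{0\}$ gives derangements and $\Omega = \{0, 1\}$ gives linear ménage placements, which provides reference values for the generating functions derived later in this section.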


Inclusion-exclusion. The set $\Omega$ being fixed, consider first for all $j$ the class of
augmented permutations $\mathcal{P}_{n,j}$ that are permutations of size $n$ such that $j$ of the
positions are distinguished and the corresponding exceedances lie in $\Omega$, the remaining
positions having arbitrary values (but with the permutation property being satisfied!).
Loosely speaking, the objects in $\mathcal{P}_{n,j}$ can be regarded as permutations with “at least”
$j$ exceedances in $\Omega$. For instance, with $\Omega = \{1\}$ and

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 3 & 4 & 8 & 6 & 7 & 1 & 5 & 9 \end{pmatrix},$$

there are 5 exceedances that lie in $\Omega$ (at positions 1, 2, 3, 5, 6) and with 3 of these
distinguished (say by enclosing them in a box), one obtains an element counted by
$P_{9,3}$ like

$$2\ \boxed{3}\ \boxed{4}\ 8\ 6\ \boxed{7}\ 1\ 5\ 9.$$

Let $P_{n,j}$ be the cardinality of $\mathcal{P}_{n,j}$. We claim that the number $Q_n \equiv Q_n^{\Omega}$ of $\Omega$-avoiding
permutations of size $n$ satisfies

$$Q_n = \sum_{j \geq 0} (-1)^j P_{n,j}. \tag{112}$$


Equation (112) is typically an inclusion-exclusion relation. To prove it formally¹⁶,
define the number $R_{n,k}$ of permutations that have exactly $k$ exceedances in $\Omega$ and the
generating polynomials

$$P_n(w) = \sum_j P_{n,j}\, w^j, \qquad R_n(w) = \sum_k R_{n,k}\, w^k.$$

The GFs are related by

$$P_n(w) = R_n(w + 1) \quad\text{or}\quad R_n(w) = P_n(w - 1).$$

(The relation $P_n(w) = R_n(w + 1)$ simply expresses symbolically the fact that each
$\Omega$-exceedance in $\mathcal{R}$ may or may not be taken in when composing an element of $\mathcal{P}$.) In
particular, we have $P_n(-1) = R_n(0) = R_{n,0} = Q_n$, as was to be proved.

¹⁶ See also the discussion in Subsection III.7.4, p. 195.

FIGURE V.23. A graphical rendering of the legal template 20?02?11? relative to $\Omega = \{0, 1, 2\}$.
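
The inclusion-exclusion relation can be verified on small cases. The sketch below is an illustration by exhaustive enumeration, with hypothetical function names: it tabulates $R_{n,k}$, forms $P_{n,j} = \sum_k \binom{k}{j} R_{n,k}$ (choose which $j$ of the $k$ exceedances in $\Omega$ are distinguished), and checks that the alternating sum of the $P_{n,j}$ returns $R_{n,0}$.

```python
from itertools import permutations
from math import comb


def count_omega_exceedances(sigma, omega):
    """Number of positions i (1-based) with sigma_i - i in omega."""
    return sum(1 for i, value in enumerate(sigma, start=1) if value - i in omega)


def exceedance_profile(n, omega):
    """R[k] = number of permutations of size n with exactly k exceedances in omega."""
    profile = [0] * (n + 1)
    for sigma in permutations(range(1, n + 1)):
        profile[count_omega_exceedances(sigma, omega)] += 1
    return profile


def augmented_counts(profile):
    """P[j] = sum_k C(k, j) R[k], the number of augmented permutations."""
    n = len(profile) - 1
    return [sum(comb(k, j) * profile[k] for k in range(j, n + 1))
            for j in range(n + 1)]


def alternating_sum(values):
    return sum((-1) ** j * v for j, v in enumerate(values))


profile = exceedance_profile(6, {0, 1})
print(profile[0], alternating_sum(augmented_counts(profile)))
```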


Transfer matrix model. The preceding discussion shows that everything relies on
the enumeration $P_{n,j}$ of permutations with distinguished exceedances in $\Omega$. Introduce
the alphabet $\mathcal{A} = \Omega \cup \{\text{‘?’}\}$, where the symbol ‘?’ is called the “don’t-care symbol”.
A word on $\mathcal{A}$, an instance with $\Omega = \{0, 1, 2\}$ being 20?02?11?, is called a template.
To an augmented permutation, one associates a template as follows: each exceedance
that is not distinguished is represented by a don’t-care symbol; each distinguished
exceedance (thereby an exceedance with value in $\Omega$) is represented by its value. A
template is said to be legal if it arises from an augmented permutation. For instance a
template 2 1 ⋯ cannot be legal since the corresponding constraints, namely $\sigma_1 - 1 = 2$,
$\sigma_2 - 2 = 1$, are incompatible with the permutation structure (one would have
$\sigma_1 = \sigma_2 = 3$). In contrast, the template 20?02?11? is seen to be legal. Figure 23 is
a graphical rendering; there, letters of templates are represented by dominoes, with a
cross at the position of a numeric value in $\Omega$, and with the domino being blank in the
case of a don’t-care symbol.

Let $\mathcal{T}_{n,j}$ be the set of legal templates relative to $\Omega$ that have length $n$ and comprise
$j$ don’t-care symbols, and let $T_{n,j}$ be its cardinality. Any such legal template is associated
to exactly $j!$ permutations, since $n - j$ position-value pairs are fixed in the permutation,
while the $j$ remaining positions and values can be taken arbitrarily. There results that

$$P_{n,n-j} = T_{n,j}\, j!, \qquad Q_n = \sum_{j=0}^{n} (-1)^{n-j}\, T_{n,j}\, j!, \tag{113}$$

by (112). Thus, the enumeration of avoiding permutations rests entirely on the
enumeration of legal templates.

The enumeration of legal templates is finally effected by means of a transfer matrix
method, or equivalently, by a finite automaton. If a template $\tau = \tau_1 \cdots \tau_n$ is legal,
then the following condition is met,

$$\tau_i + i \neq \tau_j + j, \tag{114}$$

for all pairs $(i, j)$ such that $i < j$ and neither of $\tau_i$, $\tau_j$ is the don’t-care symbol. (There
are additional conditions to characterize templates fully, but these only concern a few
letters at the end of templates and we may ignore them in this discussion.) In other
words, a $\tau_i$ with a numerical value preempts the value $\tau_i + i$. Figure 23 exemplifies
the situation in the case $\Omega = \{0, 1, 2\}$. The dominoes are shifted one position each
time (since it is the value of $\sigma_i - i$ that is represented) and the compatibility
constraint (114) is that no two crosses should be vertically aligned. More precisely the
constraints (114) are recognized by a deterministic finite automaton whose states are
indexed by subsets of $\{0, \ldots, b - 1\}$ where the “span” $b$ is defined as $b = \max_{\omega \in \Omega} \omega$.
The initial state is the one associated with the empty set (no constraint is present
initially), the transitions are of the form ($j \in \{0, \ldots, b\}$):

$$\begin{cases}
(q_S, j) \mapsto q_{S'} & \text{where } S' = ((S - 1) \cup \{j - 1\}) \cap \{0, \ldots, b - 1\}, \\
(q_S, ?) \mapsto q_{S'} & \text{where } S' = (S - 1) \cap \{0, \ldots, b - 1\}.
\end{cases}$$

The initial state is $q_{\emptyset}$ and it is equal to the final state (this translates the fact that
no domino can protrude from the right, and is implied by the linear character of the
ménage problem under consideration). In essence, the automaton only needs a finite
memory since the dominoes slide along the diagonal and, accordingly, constraints
older than the span can be forgotten. Notice that the complexity of the automaton, as
measured by its number of states, is $2^b$.
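
The automaton translates into a short dynamic program. The sketch below is an illustration under explicit assumptions rather than a transcription of the text: a numeric letter $j$ is taken to be readable from state $q_S$ only when $j \in \Omega$ and $j \notin S$ (this is how the constraint (114) is enforced here), a template is accepted when the run ends in $q_{\emptyset}$, and the names `template_counts` and `avoiding_count` are hypothetical. The counts $T_{n,j}$ are then combined as in (113).

```python
from math import factorial


def template_counts(omega, n):
    """T[j] = number of legal templates of length n with j don't-care symbols."""
    b = max(omega)
    window = frozenset(range(b))            # offsets 0..b-1 that must be remembered
    # table[state][j] = number of prefixes leading to `state` with j don't-cares
    table = {frozenset(): [1] + [0] * n}
    for _ in range(n):
        nxt = {}
        for state, row in table.items():
            shifted = {s - 1 for s in state}
            moves = [(frozenset(shifted) & window, 1)]       # don't-care symbol
            for j in omega:
                if j not in state:                           # value i + j is still free
                    moves.append((frozenset(shifted | {j - 1}) & window, 0))
            for target, extra in moves:
                acc = nxt.setdefault(target, [0] * (n + 1))
                for k, count in enumerate(row):
                    if count:
                        acc[k + extra] += count
        table = nxt
    # Accept only runs ending in the empty state (no preempted value beyond n).
    return table.get(frozenset(), [0] * (n + 1))


def avoiding_count(omega, n):
    """Q_n = sum_j (-1)^(n-j) T_{n,j} j!, as in relation (113)."""
    counts = template_counts(omega, n)
    return sum((-1) ** (n - j) * counts[j] * factorial(j) for j in range(n + 1))


print([avoiding_count({0}, n) for n in range(8)])
print([avoiding_count({0, 1}, n) for n in range(8)])
```

The table holds at most $2^b$ states, so the cost is polynomial in $n$ for fixed $\Omega$, in contrast with exhaustive enumeration of permutations.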

Here are the automata corresponding to $\Omega = \{0\}$ (derangements) and to $\Omega = \{0, 1\}$ (ménages).

[Diagrams of the two automata.]

For the ménage problem, there are two states depending on whether or not the
currently examined value has been preempted at the preceding step.

From the automaton construction, the bivariate GF $T^{\Omega}(z, u)$ of legal templates,
with $u$ marking the position of don’t-care symbols, is a rational function that can
be determined in an automatic fashion from $\Omega$. For the derangement and ménage
problems, one finds

$$T^{\{0\}}(z, u) = \frac{1}{1 - z(1 + u)}, \qquad T^{\{0,1\}}(z, u) = \frac{1 - z}{1 - z(2 + u) + z^2}.$$

In general, this gives access to the OGF of the corresponding permutations. Consider
the partial fraction expansion of $T^{\Omega}(z, u)$ with respect to $u$, taken under the form

$$T^{\Omega}(z, u) = \sum_r \frac{c_r(z)}{1 - u\, u_r(z)}, \tag{115}$$

assuming for simplicity only simple poles. There the sum is finite and it involves
algebraic functions $c_r$ and $u_r$ of the variable $z$. Finally, the OGF of $\Omega$-avoiding
permutations is obtained from $T^{\Omega}$ by the transformation

$$z^n u^k \mapsto (-z)^n (-1)^k\, k!,$$

which is the transcription of (113). Define the (divergent) OGF of all permutations,

$$F(y) = \sum_{n=0}^{\infty} n!\, y^n = {}_2F_0[1, 1; y],$$

in the terminology of hypergeometric functions. Then, by the remarks above and (115),
we find

$$Q^{\Omega}(z) = \sum_r c_r(-z)\, F(-u_r(-z)).$$

In other words, the OGF of $\Omega$-avoiding permutations is a composition of the OGF of
the factorial series with algebraic functions.
The expressions simplify much in the case of ménages and derangements where
the denominators of $T$ are of degree 1 in $u$. One has

$$Q^{\{0\}}(z) = \frac{1}{1 + z}\, F\!\left(\frac{z}{1 + z}\right) = 1 + z^2 + 2z^3 + 9z^4 + 44z^5 + 265z^6 + 1854z^7 + \cdots,$$

for derangements, whence a new derivation of the known formula,

$$Q_n^{\{0\}} = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)!.$$

Similarly, for (linear) ménage placements, one finds

$$Q^{\{0,1\}}(z) = \frac{1}{1 + z}\, F\!\left(\frac{z}{(1 + z)^2}\right) = 1 + z^3 + 3z^4 + 16z^5 + 96z^6 + 675z^7 + \cdots,$$

which is EIS A000271 and corresponds to the formula

$$Q_n^{\{0,1\}} = \sum_{k=0}^{n} (-1)^k \binom{2n - k}{k} (n - k)!.$$

Finally, the same technique adapts to constraints that “wrap around”, that is,
constraints taken modulo $n$. (This corresponds to a round table in the ménage problem.)
In that case, what should be considered is the loops in the automaton recognizing
templates (see also the discussion of the zeta function of graphs, p. 321). One finds in this
way the OGF of the circular (i.e., classical) ménage problem to be (EIS A000179)

$$\widehat{Q}^{\{0,1\}}(z) = \frac{1 - z}{1 + z}\, F\!\left(\frac{z}{(1 + z)^2}\right) = 1 - z + z^3 + 2z^4 + 13z^5 + 80z^6 + 579z^7 + \cdots,$$

which yields the classical solution of the (circular) ménage problem,

$$\widehat{Q}_n^{\{0,1\}} = \sum_{k=0}^{n} (-1)^k\, \frac{2n}{2n - k} \binom{2n - k}{k} (n - k)!,$$

a formula that is due to Touchard; see [98, p. 185] for pointers to the vast classical
literature on the subject. The algebraic part of the treatment above is close to the
inspiring discussion found in Stanley’s book [447]. An application to robustness of
interconnections in random graphs is presented in [190].
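
The three alternating-sum formulae of this subsection are easy to evaluate exactly. The sketch below (function names are illustrative) computes derangement numbers, linear ménage numbers and Touchard's circular ménage numbers with integer arithmetic; the value 1 at $n = 0$ in the circular case is a convention adopted here, since the factor $2n/(2n-k)$ is undefined for $n = k = 0$.

```python
from math import comb, factorial


def derangements(n):
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def linear_menage(n):
    return sum((-1) ** k * comb(2 * n - k, k) * factorial(n - k)
               for k in range(n + 1))


def circular_menage(n):
    """Touchard's formula; the case n = 0 is set to 1 by convention."""
    if n == 0:
        return 1
    # 2n * C(2n-k, k) / (2n-k) counts k-matchings of a 2n-cycle, hence is an integer.
    return sum((-1) ** k * (2 * n * comb(2 * n - k, k) // (2 * n - k))
               * factorial(n - k) for k in range(n + 1))


print([derangements(n) for n in range(8)])
print([linear_menage(n) for n in range(8)])
print([circular_menage(n) for n in range(8)])
```

The factor $(1 - z)$ that distinguishes the circular OGF from the linear one shows up as the relation `circular_menage(n) == linear_menage(n) - linear_menage(n - 1)` for $n \geq 1$.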


Asymptotic analysis. For asymptotic analysis purposes, the following general
property proves useful: Let $F$ be the OGF of factorial numbers and assume that $y(z)$
is analytic at the origin where it satisfies $y(z) = z - \lambda z^2 + O(z^3)$; then the following
estimate holds:

$$[z^n] F(y(z)) \sim [z^n] F(z(1 - \lambda z)) \sim n!\, e^{-\lambda}. \tag{116}$$

(The proof results from simple manipulations of divergent series in the style of [29].)
This gives at sight the estimates

$$Q_n^{\{0\}} \sim n!\, e^{-1}, \qquad Q_n^{\{0,1\}} \sim n!\, e^{-2}.$$

More generally, for any set $\Omega$ containing $\lambda$ elements, one has

$$Q_n^{\Omega} \sim n!\, e^{-\lambda}.$$

Furthermore, the number $R_{n,k}^{\Omega}$ of permutations having exactly $k$ occurrences ($k$ fixed)
of an exceedance in $\Omega$ is asymptotic to

$$R_{n,k}^{\Omega} \sim n!\, e^{-\lambda}\, \frac{\lambda^k}{k!}.$$

In other words, the rare event that an exceedance belongs to $\Omega$ obeys a Poisson
distribution with $\lambda = |\Omega|$. These last two results are established by means of probabilistic
techniques in the book [23, Sec. 4.3]. The relation (116) provides a way of arriving at
such estimates by purely analytic-combinatorial techniques.
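
These estimates can be observed numerically from the exact formulae for $\Omega = \{0\}$ and $\Omega = \{0, 1\}$. The following sketch is a sanity check only, with illustrative names; it uses exact rational arithmetic before converting to floating point, so that the large factorials cause no overflow.

```python
from fractions import Fraction
from math import comb, exp, factorial


def derangement_ratio(n):
    """Q_n / n! for Omega = {0}."""
    count = sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))
    return float(Fraction(count, factorial(n)))


def linear_menage_ratio(n):
    """Q_n / n! for Omega = {0, 1} (linear table)."""
    count = sum((-1) ** k * comb(2 * n - k, k) * factorial(n - k)
                for k in range(n + 1))
    return float(Fraction(count, factorial(n)))


for n in (10, 20, 60):
    print(n, derangement_ratio(n) / exp(-1), linear_menage_ratio(n) / exp(-2))
```

Both ratios approach 1 as $n$ grows, in agreement with $Q_n^{\Omega} \sim n!\, e^{-|\Omega|}$.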


▷ V.51. Other constrained permutations. Given a permutation $\sigma = \sigma_1 \cdots \sigma_n$, a succession gap
is defined as any difference $\sigma_{i+1} - \sigma_i$. Discuss the counting of permutations whose succession
gaps are constrained to lie outside of a finite set $\Omega$. In how many ways can a kangaroo pass
through all points of the integer interval $[1, n]$ starting at 1 and ending at $n$ while making hops
that belong to $\{-2, -1, 1, 2\}$? ◁


V.7. Perspective 


The theorems in this chapter demonstrate the power of the fundamental tech- 
niques developed in Chapter IV, which exploit classical theorems in complex analysis 
to develop coefficient asymptotics. As we begin to see here, this approach applies
to many of the generating functions derived from the formal combinatorial techniques 
of Part A of this book. By paying careful attention to the types of combinatorial con- 
structions involved, we are able to identify abstract schemas that help us solve whole 
classes of problems at once. Each schema connects a type of combinatorial construc- 
tion to a complex asymptotic method. In this way, it becomes possible to discuss 
properties shared by an infinite collection of combinatorial classes. In this chapter, 
we have presented the method in detail for classes that involve a sequence construc- 
tion and classes recursively defined by a linear system of equations (paths in graphs, 
automata, transfer matrices). 

In an ideal world, we might wish to have a direct correspondence between com- 
binatorial constructions and analytic methods—a theory that would carry all the way 
from combinatorial objects of any description to full analysis of all their properties. 
The case of paths in graphs and automata, with its strong connectedness condition
leading to Perron-Frobenius theory, is an instance of this ideal situation. Reality is 
however usually a bit more complex: theorems for deriving asymptotic results from 
combinatorial specifications must often have some sort of analytic side conditions. 
A typical example is the radius of convergence condition for supercritical sequences. 
As soon as such conditions are satisfied, the asymptotic properties of large structures 
become highly predictable. This is the very essence of analytic combinatorics. 

In the next two chapters, we investigate generating functions whose singularities 
are no longer poles—fractional exponents and logarithmic factors become allowed. 
This first necessitates investing in general methodology, a task undertaken in Chap- 
ter VI where the method known as singularity analysis is developed. Then, a chapter 
parallel to the present one, Chapter VII, will present a number of new schemas based 
on the set and cycle constructions, as well as on recursion.


Applications of rational functions in discrete and continuous mathematics are in abun- 
dance. Many examples are to be found in Goulden and Jackson’s book [244]. Stanley [447] 
even devotes a full chapter of his book Enumerative Combinatorics, vol. I, to rational generating 
functions. These two books push the theory further than we can do here, but the corresponding 
asymptotic aspects which we expose lie outside of their scope. The analytic theory of posi- 
tive rational functions starts with the works of Perron and Frobenius at the beginning of the 
twentieth century and is explained in books on matrix theory like those of Bellman [26] and
Gantmacher [225]. Its importance has been long recognized in the theory of finite Markov 
chains, so that the basic theory of positive matrices is well developed in many elementary trea- 
tises on probability theory. For such aspects, we refer for instance to the classic presentations 
by Feller [161] or Karlin and Taylor [290]. 

The supercritical sequence schema is the first in a list of abstract schemas that neatly exem- 
plify the interplay between combinatorial, analytic, and probabilistic properties of large random 
structures. The origins of this approach are to be traced to early works of Bender [28, 29] fol- 
lowed by Soria and Flajolet [210, 212, 443]. 

Turning to more specific topics, we mention in relation to Section V.3 the first global
attempt at a combinatorial theory of continued fractions by Flajolet in [168] together with related
works of Jackson of which an exposition is to be found in [244, Ch. 5] and a synthesis in [188] 
in relation to birth and death processes. Walks on graphs from an algebraic standpoint are well 
discussed in Godsil’s book [238]; for infinite graphs and groups, see Woess [500]. The discus- 
sion of local constraints in permutations based on [190] combines some of the combinatorial 
elements found in Stanley’s book [447] with the general philosophy of analytic combinatorics.
Our treatment of words and languages largely draws its inspiration from the line of research 
started by Schützenberger in the early 1960s and on the subsequent account to be found in
Lothaire’s book [337]. A nice review of transfer matrix methods (including a discussion of 
limit distributions) is offered by Bender, Richmond, and Williamson in [37]. 


VI 


Singularity Analysis of Generating 
Functions 


Es ist eine Tatsache, daß die genauere Kenntnis
des Verhaltens einer analytischen Funktion
in der Nähe ihrer singulären Stellen
eine Quelle von arithmetischen Sätzen ist.¹


— ERICH HECKE [264, Kap. VIII] 
A function’s singularities are reflected in the function’s coefficients. Chapters IV
and V have treated in detail rational fractions and meromorphic functions, where the 
local analysis of polar singularities provides contributions to coefficients in the form 
of products of polynomials and simple exponentials. In this chapter, we present a 
general approach to the analysis of coefficients of generating functions that is not re- 
stricted to polar singularities and extends to a very large class of functions that have 
moderate growth or decay at their dominant singularities. The basic principle behind 
this extension is the existence of a general correspondence between 


the asymptotic expansion of a function near its dominant singularities 
and 
the asymptotic expansion of the function’s coefficients. 
This mapping essentially preserves orders of growth in the sense that larger functions
tend to have larger coefficients. It extends considerably the analysis of meromorphic
functions in Chapters IV–V and further justifies the Principles of Coefficient
Asymptotics stated in Chapter IV, p. 215.


¹ “It is a fact that the precise knowledge of the behaviour of an analytic function in the vicinity of its
singular points is a source of arithmetic properties.”
Precisely, the method of singularity analysis applies to functions whose singular
expansions involve fractional powers and logarithms; we refer to such singularities
as “algebraic-logarithmic”. It principally relies on two types of results.


- First, it is possible to set up a catalogue of asymptotic expansions for
coefficients of the standard functions that occur in such singular expansions.

- Second, transfer theorems allow us to extract the asymptotic order of
coefficients of the error terms that appear in singular expansions.


The developments are based on Cauchy’s coefficient formula, used in conjunction
with special contours of integration known as Hankel contours. The contours come
very close to the singularities then steer away: by design, they have the property of
capturing the essential asymptotic information contained in the functions’ singularities.

The method of singularity analysis is robust: the class of functions amenable to it
is closed under a variety of operations, including sum, product, integration,
differentiation, and composition. Another important feature of the method is that it
only necessitates local asymptotic properties of the function to be analysed. In this
way, it often proves instrumental in the case of functions that are only indirectly
accessible through functional equations.

This chapter is meant to develop the basic technology of singularity analysis and, 
like Chapter IV, it is largely of a methodological nature. We illustrate the approach by 
a few combinatorial problems, including simple varieties of trees (e.g., unary-binary
trees), combinatorial sums, the supercritical cycle construction, supertrees, Pólya’s
drunkard walks, and tree recurrences. The next chapter, Chapter VII, will systemat- 
ically explore combinatorial structures and schemas as well as functional equations 
that can be asymptotically analysed by means of singularity analysis in a way that 
parallels Chapter V regarding meromorphic asymptotics. 


VI.1. A glimpse of basic singularity analysis theory 


Rational and meromorphic functions involve locally near a singularity elements of
the form $(1 - z/\omega)^{-k}$. Accordingly their coefficients involve asymptotically exponential
polynomials, that is, finite linear combinations of elements of the type $\omega^{-n} n^{k-1}$,
with $k$ a positive integer. We examine here an approach that takes into account functions
whose singularities are of a richer nature than mere poles found in rational and
meromorphic functions. The method, called singularity analysis, applies to functions
whose expansion at a singularity $\omega$ involves elements of the form

$$\left(1 - \frac{z}{\omega}\right)^{-\alpha} \left(\log \frac{1}{1 - \frac{z}{\omega}}\right)^{\beta}.$$

Under suitable conditions to be discussed in detail in this chapter, any such element
contributes a term of the form

$$\omega^{-n}\, n^{\alpha - 1} (\log n)^{\beta}.$$

Here, $\alpha$ and $\beta$ can be arbitrary real (or even complex) numbers.
Location of singularities and exponential factors. The exponential factor $\omega^{-n}$
present in earlier expansions is easily accounted for (see Chapter IV), as the location
of the dominant singularities always induces a multiplicative exponential factor for
coefficients. Indeed, if $f(z)$ is singular at $z = \omega$, then $g(z) = f(z\omega)$ satisfies, by the
scaling rule of Taylor expansions,

$$[z^n] f(z) = \omega^{-n}\, [z^n] f(z\omega) = \omega^{-n}\, [z^n] g(z),$$

and $g(z)$ itself is singular on the unit circle, but not inside the disc. Consequently, in
most of the discussion that follows, we shall examine functions $f(z)$ that are singular
at $z = 1$, a condition that entails no loss of generality.

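Basic scale. Consider the following table of commonly encountered functions
that are singular at 1, together with their coefficients:

$$\begin{array}{llll}
 & \text{Function} & \text{Coefficient (exact)} & \text{Coefficient (asymptotic)} \\[4pt]
(f_1) & 1 - \sqrt{1 - z} & \dfrac{2}{n} \dbinom{2n-2}{n-1} 4^{-n} & \sim \dfrac{1}{2\sqrt{\pi n^3}} \\[8pt]
(f_2) & \dfrac{1}{\sqrt{1 - z}} & \dbinom{2n}{n} 4^{-n} & \sim \dfrac{1}{\sqrt{\pi n}} \\[8pt]
(f_3) & \dfrac{1}{1 - z} & 1 & \sim 1 \\[8pt]
(f_4) & \dfrac{1}{1 - z} \log \dfrac{1}{1 - z} & H_n & \sim \log n \\[8pt]
(f_5) & \dfrac{1}{(1 - z)^2} & n + 1 & \sim n
\end{array} \tag{1}$$

Some structure is apparent in this table: a logarithmic factor in the function is reflected
by a similar factor in the coefficients, square-roots somehow induce square-roots, and
functions involving larger powers have larger coefficients.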
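
The correspondence between exact and asymptotic coefficients can be observed numerically. The sketch below is a simple illustration with hypothetical function names; it evaluates the exact coefficients of $1/\sqrt{1-z}$, $\frac{1}{1-z}\log\frac{1}{1-z}$ and $1/(1-z)^2$ at a moderately large index and divides by the asymptotic forms $1/\sqrt{\pi n}$, $\log n$ and $n$.

```python
from math import comb, log, pi, sqrt


def inverse_sqrt_coeff(n):
    """[z^n] (1 - z)^(-1/2) = C(2n, n) / 4^n."""
    return comb(2 * n, n) / 4 ** n


def harmonic(n):
    """[z^n] (1/(1 - z)) log(1/(1 - z)) = H_n."""
    return sum(1 / k for k in range(1, n + 1))


def double_pole_coeff(n):
    """[z^n] (1 - z)^(-2) = n + 1."""
    return n + 1


n = 1000
ratios = (
    inverse_sqrt_coeff(n) * sqrt(pi * n),   # tends to 1 like 1 - 1/(8n)
    harmonic(n) / log(n),                   # tends to 1 only logarithmically
    double_pole_coeff(n) / n,               # equals 1 + 1/n
)
print(ratios)
```

The logarithmic case converges slowly because $H_n = \log n + \gamma + O(1/n)$, so the relative error decays only like $1/\log n$.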

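It is easy to come up at least with a partial explanation of these observations.
Regarding basic functions such as $f_1$, $f_2$, $f_3$, and $f_5$, the Newton expansion

$$(1 - z)^{-\alpha} = \sum_{n=0}^{\infty} \binom{n + \alpha - 1}{n} z^n$$

when specialized to an integer $k$ immediately gives the asymptotic form of the
coefficients involved,

$$[z^n](1 - z)^{-k} = \frac{(n+1)(n+2)\cdots(n+k-1)}{(k-1)!} = \frac{n^{k-1}}{(k-1)!} \left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{2}$$

For general $\alpha$, it is therefore natural to expect

$$[z^n](1 - z)^{-\alpha} = \binom{n + \alpha - 1}{n} = \frac{n^{\alpha-1}}{(\alpha-1)!} \left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{3}$$

It turns out that this asymptotic formula is valid for real or complex $\alpha$, provided we
interpret $(\alpha - 1)!$ suitably. We shall prove the estimate (see Section VI.2 and
Theorem VI.1)

$$[z^n](1 - z)^{-\alpha} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + \frac{\alpha(\alpha-1)}{2n} + \cdots\right), \tag{4}$$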
where $\Gamma(\alpha)$ is the Euler Gamma function defined as

$$\Gamma(\alpha) := \int_0^{\infty} e^{-t}\, t^{\alpha-1}\, dt, \tag{5}$$

for $\Re(\alpha) > 0$, which coincides with $(\alpha - 1)!$ whenever $\alpha$ is an integer. (Basic
properties of this function are recalled in APPENDIX B: Gamma function, p. 689.)

FIGURE VI.1. The five functions from Eq. (1) and a plot of their coefficient sequences
illustrate the tendency of coefficient extraction to be consistent with orders of growth of
functions.

We observe from the pair (2)–(3) that functions that are larger at the singularity
$z = 1$ have larger coefficients (see Figure 1). The correspondence that this observation
suggests is very general as we are going to see repeatedly throughout this chapter. A
catalogue of exact or asymptotic forms for coefficients of standard singular functions
is obtained in Section VI.2 (see Theorem VI.1).


Transfer of error terms. An asymptotic expansion of a function $f(z)$ that is
singular at $z = 1$ is typically of the form

$$f(z) = \sigma(z) + O(\tau(z)) \quad\text{where}\quad \sigma(z) \gg \tau(z) \text{ as } z \to 1, \tag{6}$$

with $\sigma$ and $\tau$ belonging to an asymptotic scale of standard functions like the
collection $\{(1 - z)^{-\alpha}\}_{\alpha \in \mathbb{R}}$ in simpler cases. Taking formally Taylor coefficients in the
expansion (6), we arrive at

$$f_n \equiv [z^n] f(z) = [z^n]\sigma(z) + [z^n] O(\tau(z)). \tag{7}$$

The term $[z^n]\sigma(z)$ is described asymptotically by (4). Therefore, in order to extract
asymptotic information on the coefficients of $f(z)$, one needs a way of extracting
coefficients of functions known only by their order of growth around the singularity.

Such a translation of error terms from functions to coefficients is achieved by transfer
theorems, which, under conditions of analytic continuation, guarantee that

$$[z^n] O(\tau(z)) = O([z^n]\tau(z)).$$

(See Section VI.3 and Theorem VI.3.) This relation is much less trivial than its
symbolic form would seem to imply.
In summary, it is the goal of this chapter to expose the (favorable) conditions
under which we have available the correspondence (cf. Section VI.4 and Theorem VI.4)

$$f(z) = \sigma(z) + O(\tau(z)) \quad\Longrightarrow\quad f_n = \sigma_n + O(\tau_n). \tag{8}$$

This process of singularity analysis is then seen to parallel the analysis of coefficients
of rational and meromorphic functions presented in the previous two chapters. We
describe the method for functions from the scale

$$\frac{1}{(1 - z)^{\alpha}} \left(\frac{1}{z} \log \frac{1}{1 - z}\right)^{\beta} \qquad (z \to 1),$$

whose coefficients have subexponential factors of the form

$$\theta(n) = n^{\alpha-1} (\log n)^{\beta}.$$

The range of singular behaviours taken into account by singularity analysis is in fact
considerably larger: iterated logarithms (log log’s) and more exotic functions can be
encapsulated in the method.


EXAMPLE VI.1. First asymptotics of 2-regular graphs. As an illustration of the modus
operandi of singularity analysis, consider the function

$$f(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1 - z}},$$

which is the EGF of 2-regular graphs (or equivalently, “clouds”, see Note II.21, p. 124).
Singularity analysis permits us to reason as follows. The function $f(z)$ is only singular at $z = 1$
where it has a branch point. Expanding the numerator around $z = 1$, we have

$$f(z) = \frac{e^{-3/4}}{\sqrt{1 - z}} + O\!\left((1 - z)^{1/2}\right). \tag{9}$$

Therefore (see Theorems VI.1 and VI.3, as well as the discussion in Example VI.2 below,
p. 378), upon translating formally and term-by-term, one has

$$[z^n] f(z) = e^{-3/4} \binom{n - \frac12}{n} + O\!\left(\binom{n - \frac32}{n}\right) = \frac{e^{-3/4}}{\sqrt{\pi n}} + O\!\left(n^{-3/2}\right). \tag{10}$$

Furthermore, a full asymptotic expansion into descending powers of $n$ can be obtained in the
same way from a full expansion of the numerator $e^{-z/2 - z^2/4}$. ... END OF EXAMPLE VI.1.
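
The estimate (10) can be compared with exact coefficients. The sketch below is an illustration, not part of the example: it relies on the differential equation $(1 - z) f'(z) = \frac{z^2}{2} f(z)$, which follows from logarithmic differentiation of $f$ and yields the recurrence $(n+1) f_{n+1} = n f_n + \frac12 f_{n-2}$. The function name is hypothetical.

```python
from math import exp, pi, sqrt


def two_regular_egf_coeffs(n_max):
    """Coefficients f_0..f_{n_max} of exp(-z/2 - z^2/4) / sqrt(1 - z).

    Uses (n + 1) f_{n+1} = n f_n + f_{n-2} / 2, a consequence of
    (1 - z) f'(z) = (z^2 / 2) f(z).
    """
    f = [1.0]
    for n in range(n_max):
        older = f[n - 2] if n >= 2 else 0.0
        f.append((n * f[n] + older / 2) / (n + 1))
    return f


coeffs = two_regular_egf_coeffs(2000)
# n! f_n counts labelled 2-regular graphs on n nodes.
small_counts = [round(coeffs[n] * fact) for n, fact in ((3, 6), (4, 24), (5, 120), (6, 720))]
ratio = coeffs[2000] * sqrt(pi * 2000) / exp(-0.75)
print(small_counts, ratio)
```

The printed ratio is close to 1, with a relative deviation of order $1/n$ as predicted by the error term in (10).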


Plan of this chapter. The first part of this chapter, Sections VI.2–VI.5, is dedicated
to the basic technology of singularity analysis along the lines of our foregoing
discussion, including the case of functions with finitely many singularities on the
boundary of their disc of convergence. An “Intermezzo”, Section VI.6, serves as a
prelude to the second part of the chapter, where we investigate operations on generating
functions whose effect on singularities is predictable. The most important of these is
inversion, which, under a broad set of conditions, leads to square-root singularity and
provides a unified asymptotic theory of simple varieties of trees (Section VI.7).
Polylogarithms are proved to be amenable to singularity analysis in Section VI.8, a fact that
permits us to take into account weights like $\sqrt{n}$ or $\log n$ in combinatorial sums.
Composition of functions is studied in Section VI.9. Then Section VI.10 presents several
closure properties of functions of singularity analysis class, including differentiation,
integration, and Hadamard product. The chapter concludes with a brief presentation
of two classical alternatives to singularity analysis, Tauberian theory and Darboux’s
method.


VI. 2. Coefficient asymptotics for the basic scale 


This section and the next two present the fundamentals of singularity analysis, a 
theory which was developed by Flajolet and Odlyzko in [199]. Technically the theory 
relies on a systematic use of Hankel contours in Cauchy coefficient integrals. Hankel 
contours classically serve to express the Gamma function: see APPENDIX B: Gamma 
function, p. 689. Here they are first used to estimate coefficients of a standard scale of 
functions, and then to prove transfer theorems for error terms in Section VI.3. With 
this basic process, an asymptotic expansion of a function near a singularity is directly 
mapped to a matching asymptotic expansion of its coefficients. 

Starting from the binomial expansion, we have for general $\alpha$,

$$[z^n](1 - z)^{-\alpha} = (-1)^n \binom{-\alpha}{n} = \binom{n + \alpha - 1}{n} = \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!}.$$

This quantity is expressible in terms of Gamma factors, and

$$\binom{n + \alpha - 1}{n} = \frac{\Gamma(n + \alpha)}{\Gamma(\alpha)\,\Gamma(n + 1)}, \tag{11}$$

provided $\alpha$ is neither 0 nor a negative integer. (When $\alpha \in \{0, -1, \ldots\}$, the coefficients
$\binom{n+\alpha-1}{n}$ eventually vanish, so that the asymptotic problem of estimating
$[z^n](1 - z)^{-\alpha}$ becomes void.) The asymptotic analysis of the coefficients $\binom{n+\alpha-1}{n}$ can be
carried out elementarily by means of Stirling’s formula and real integral estimates:
see Notes 1 and 2.

A method far more productive than elementary real analysis techniques consists
in analysing coefficients of a function $f(z)$ by means of Cauchy’s coefficient formula,

$$[z^n] f(z) = \frac{1}{2i\pi} \int_{\gamma} f(z)\, \frac{dz}{z^{n+1}}.$$

The basic principle is extremely simple: it consists in choosing a contour of integration
$\gamma$ that comes at distance $\frac{1}{n}$ of the singularity $z = 1$. Under the change of variables
$z = 1 + t/n$, the kernel $z^{-n-1}$ in the integral transforms into an exponential, and the
function can be locally expanded, with the differential coefficient only introducing a
rescaling factor of $1/n$:

$$\frac{1}{z^{n+1}} \mapsto e^{-t}, \qquad 1 - z \mapsto -\frac{t}{n}, \qquad dz \mapsto \frac{1}{n}\, dt. \tag{12}$$

This gives us for instance (precise justification below):

$$[z^n](1 - z)^{-\alpha} \sim g_{\alpha}\, n^{\alpha-1}, \qquad g_{\alpha} := \frac{1}{2i\pi} \int e^{-t} (-t)^{-\alpha}\, dt.$$

The contour and the associated rescaling “capture” the behaviour of the function near
its singularity, thereby enabling coefficient estimation.
FIGURE VI.2. The contours $\mathcal{C}_0$, $\mathcal{C}_1$, and $\mathcal{C}_2 \equiv \mathcal{H}(n)$ used for estimating the coefficients
of functions from the standard asymptotic scale.


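Theorem VI.1 (Standard function scale). Let $\alpha$ be an arbitrary complex number in
$\mathbb{C} \setminus \mathbb{Z}_{\leq 0}$. The coefficient of $z^n$ in

$$f(z) = (1 - z)^{-\alpha}$$

admits for large $n$ a full asymptotic expansion in descending powers of $n$,

$$[z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + \sum_{k=1}^{\infty} \frac{e_k}{n^k}\right),$$

where $e_k$ is a polynomial in $\alpha$ of degree $2k$. In particular:

$$[z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + \frac{\alpha(\alpha-1)}{2n} + \frac{\alpha(\alpha-1)(\alpha-2)(3\alpha-1)}{24 n^2} + \frac{\alpha^2(\alpha-1)^2(\alpha-2)(\alpha-3)}{48 n^3} + \cdots\right). \tag{13}$$

The quantity $e_k$ is a polynomial in $\alpha$ that is divisible by $(\alpha-1)\cdots(\alpha-k)$, in accordance with
the fact that the asymptotic expansion terminates when $\alpha \in \mathbb{Z}_{\geq 1}$. The factor $1/\Gamma(\alpha)$ vanishes
when $\alpha \in \mathbb{Z}_{\leq 0}$, in accordance with the fact that coefficients are asymptotically 0 in that case.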
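
The quality of the expansion is easy to observe numerically. The following sketch is a floating-point illustration with hypothetical function names; it compares the exact coefficient, computed as a running product, with the truncated expansion using the three correction terms displayed in the theorem.

```python
from math import gamma


def exact_coeff(alpha, n):
    """[z^n](1 - z)^(-alpha) = alpha (alpha + 1) ... (alpha + n - 1) / n!."""
    value = 1.0
    for k in range(1, n + 1):
        value *= (alpha + k - 1) / k
    return value


def approx_coeff(alpha, n, terms=3):
    """Leading factor n^(alpha-1)/Gamma(alpha) times the first `terms` corrections."""
    a = alpha
    e = [
        a * (a - 1) / 2,
        a * (a - 1) * (a - 2) * (3 * a - 1) / 24,
        a ** 2 * (a - 1) ** 2 * (a - 2) * (a - 3) / 48,
    ]
    correction = 1.0 + sum(e[k] / n ** (k + 1) for k in range(terms))
    return n ** (a - 1) / gamma(a) * correction


for alpha in (0.5, 2.5, 3.0):
    exact = exact_coeff(alpha, 200)
    print(alpha, [abs(approx_coeff(alpha, 200, t) / exact - 1) for t in range(4)])
```

For $\alpha = 3$ the expansion terminates, so the truncated formula reproduces $\binom{n+2}{2}$ up to rounding; for non-integer $\alpha$ each extra term gains roughly a factor $1/n$ in relative accuracy.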
PROOF. The first step is to express the coefficient $[z^n](1 - z)^{-\alpha}$ as a complex integral
by means of Cauchy’s coefficient formula,

$$f_n = \frac{1}{2i\pi} \int_{\mathcal{C}} (1 - z)^{-\alpha}\, \frac{dz}{z^{n+1}}, \tag{14}$$

where $\mathcal{C}$ is a small enough contour that encircles the origin; see Figure 2. For instance,
we can start with $\mathcal{C} \equiv \mathcal{C}_0$, where $\mathcal{C}_0$ is the positively oriented circle $\mathcal{C}_0 = \{z,\ |z| = \frac12\}$.
The second step is to deform $\mathcal{C}_0$ into another simple closed curve $\mathcal{C}_1$ around the origin
that does not cross the half-line $\Re(z) \geq 1$: the contour $\mathcal{C}_1$ consists of a large circle
of radius $R > 1$ with a notch that comes back near and to the left of $z = 1$. Since
the integrand along large circles decreases as $O(R^{-n-\alpha})$, we can finally let $R$ tend to
infinity and are left with an integral representation for $f_n$ where $\mathcal{C}$ has been replaced
by a contour $\mathcal{C}_2$ that starts from $+\infty$ in the lower half plane, winds clockwise around
1, and ends at $+\infty$ in the upper half plane. This is a typical case of a Hankel contour.
A judicious choice of its distance to the half-line $\mathbb{R}_{\geq 1}$ yields the expansion.
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To specify precisely the integration path, we particularize C2 to be the contour 
H(n) that passes at a distance + from the half line R>1: 


(15) H(n) =H (n) + H*(n) + H?(n) 
where 
H-(n) = {z=w-—4,w>1} 
(16) Ht(n) = {z=w+7,w2>I1} 
Hn) = {z=1-<, 6€[-F, 5}. 


Now, a change of variable 


t 
(17) g=14+— 
n 


in the integral (14) gives the form 


no-t t —n-1 
(18) n= i: (-t)-° (1+ “) dt. 


(The Hankel contour H is the same as in the proof of Theorem B.1, 691.) 
We have the asymptotic expansion 
(19) 
onl 2 4 3 2 
142 = e~ (n+) log(1+t/n) —e? jee OUT 5. 05 
n 2n 24n? , 

which tells us that the integrand in (18) converges pointwise (as well as uniformly 
in any bounded domain of the ¢ plane) to (—t)~°e~*. This quantity is precisely the 
kernel that appears in Hankel’s formula for the Gamma function (p. 691). Substitution 
of the asymptotic form 


—n-1 
t 1 
(1 + *) =e (1 + o()) ; 
n n 
as n — oo inside the integral (18) suggests (formally) that 
not 1 
"(1-2)°%= 1 —)). 
ea 2" =F (1402)) 


To justify the formal argument outlined in the previous paragraph, we proceed as
follows:

(i) Split the contour according to $\Re(t) \le \log^2 n$ and $\Re(t) \ge \log^2 n$.

(ii) Verify that the part corresponding to $\Re(t) \ge \log^2 n$ is negligible in the scale
of the problem. For instance, one has
\[ \left( 1 + \frac{t}{n} \right)^{-n} = O(\exp(-\log^2 n)) \quad \text{for } \Re(t) \ge \log^2 n. \tag{20} \]

Terms included up to              n = 10   n = 20       n = 50
$1$                               18708    6935533866   2022877684829178931751713264
$-\frac{9}{8} n^{-1}$             16603    6545410086   1977362936920522405787299715
$+\frac{145}{128} n^{-2}$         16815    6565051785   1978279553371460627490749710
$-\frac{1155}{1024} n^{-3}$       16794    6564073885   1978261300061101426696482732
$+\frac{36939}{32768} n^{-4}$     16796    6564122750   1978261664919884629357813591
$-\frac{295911}{262144} n^{-5}$   16796    6564120303   1978261657612856326190245636
$+\frac{4735445}{4194304} n^{-6}$ 16796    6564120426   1978261657759023715384519184
$-\frac{37844235}{33554432} n^{-7}$ 16796  6564120420   1978261657756103402179527600
$C_n$                             16796    6564120420   1978261657756160653623774456

FIGURE VI.3. Improved approximations to the Catalan numbers obtained by successive
terms of their asymptotic expansion.


(iii) Use a terminating form of (19) to develop an expansion to any predetermined
order, with uniform error terms, for the part corresponding to $\Re(t) \le \log^2 n$.
(This is possible because $t/n = O(\log^2 n / n)$ is small.)

These considerations validate term-by-term integration of expansion (19) within the
integral of (18), so that the full expansion of $f_n$ is determined as follows: A term of
the form $t^r/n^k$ in the expansion (19) induces, by Hankel's formula, a term of the form
$n^{-k}/\Gamma(\alpha - r)$. (The expansion so obtained is nondegenerate provided $\alpha$ differs from
a negative integer or zero; see also Note 3 for details.) Since
\[ \frac{1}{\Gamma(\alpha-k)} = \frac{1}{\Gamma(\alpha)} (\alpha-1)(\alpha-2)\cdots(\alpha-k), \]
the expansion in the statement of the theorem eventually follows.


The asymptotic approximations obtained from Theorem VI.1 differ from the ones
that are associated with meromorphic asymptotics (Chapter IV), where exponentially
small error terms could be derived. However, it is not uncommon to obtain results
with about $10^{-6}$ accuracy, already for values of $n$ in the range $10^1$–$10^2$, with just a few
terms of the asymptotic expansion. Figure 3 exemplifies this situation by displaying
the approximations obtained for the Catalan numbers,
\[ C_n = \frac{4^n}{n+1} [z^n](1-z)^{-1/2}, \]
when $C_{10}$, $C_{20}$, $C_{50}$ are considered and up to eight asymptotic terms are taken into
account.
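The first rows of the $n = 10$ column of Figure 3 can be reproduced with a few lines of code. The following is a minimal sketch, assuming the rational coefficients $1, -\frac98, \frac{145}{128}, -\frac{1155}{1024}, \frac{36939}{32768}$ of the expansion of $C_n \cdot \sqrt{\pi n^3}/4^n$ in powers of $1/n$; the names `CATALAN_COEFFS` and `catalan_approximations` are illustrative only.

```python
from math import comb, pi, sqrt

# Assumed coefficients of n^0, n^-1, ..., n^-4 in the expansion of
# C_n * sqrt(pi n^3) / 4^n, stored as (numerator, denominator) pairs.
CATALAN_COEFFS = [(1, 1), (-9, 8), (145, 128), (-1155, 1024), (36939, 32768)]


def catalan(n):
    """Exact Catalan number C_n = binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_approximations(n):
    """Successive approximations to C_n using 1, 2, ... terms of the expansion."""
    leading = 4.0 ** n / sqrt(pi * n ** 3)
    partial = 0.0
    results = []
    for k, (num, den) in enumerate(CATALAN_COEFFS):
        partial += (num / den) / n ** k
        results.append(leading * partial)
    return results


if __name__ == "__main__":
    print([round(x) for x in catalan_approximations(10)], catalan(10))
```

With five terms, the rounded approximation already agrees with $C_{10}$.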


▷ VI.1. Stirling's formula and asymptotics of binomial coefficients. The Gamma function
form (11) of the binomial coefficients yields
\[ [z^n](1-z)^{-\alpha} = \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left( 1 + O\left(\frac1n\right) \right), \]
when Stirling's formula is applied to the Gamma factors. ◁

▷ VI.2. Beta integrals and asymptotics of binomial coefficients. A direct way of obtaining the
general asymptotic form of $\binom{n+\alpha-1}{n}$ bases itself on the Eulerian Beta integral (see [492, p. 254]
and APPENDIX B: Gamma function, p. 689). Consider the quantity
\[ \phi(n, \alpha) := \int_0^1 t^{\alpha-1} (1-t)^{n-1}\, dt \equiv \frac{(n-1)!}{\alpha(\alpha+1)\cdots(\alpha+n-1)}, \]
where the second form results elementarily from successive integrations by parts. The change
of variables $t = x/n$ yields
\[ \phi(n, \alpha) = \frac{1}{n^{\alpha}} \int_0^n x^{\alpha-1} \left( 1 - \frac{x}{n} \right)^{n-1} dx \sim \frac{1}{n^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x}\, dx \equiv \frac{\Gamma(\alpha)}{n^{\alpha}}, \]
where the asymptotic form results from the standard limit formula of the exponential:
$\exp(a) = \lim_{n\to\infty} (1 + a/n)^n$. ◁
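▷ VI.3. Computability of full expansions. The coefficients $e_k$ of Theorem VI.1 satisfy
\[ e_k = \sum_{\ell=k}^{2k} (-1)^{\ell} \lambda_{k,\ell}\, (\alpha-1)(\alpha-2)\cdots(\alpha-\ell), \]
where $\lambda_{k,\ell} := [v^k t^{\ell}]\, e^{t} (1 + vt)^{-1-1/v}$. ◁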


▷ VI.4. Oscillations and complex exponents. Oscillations occur in the case of singular expansions
involving complex exponents. From the consideration of $[z^n](1-z)^{\pm i}$ one finds
\[ [z^n] \cos\left( \log \frac{1}{1-z} \right) = \frac{1}{n} P(\log n) + O\left(\frac{1}{n^2}\right), \]
where $P(u)$ is a continuous and $2\pi$-periodic function. In general, such oscillations are present
in $[z^n](1-z)^{-\alpha}$ for any nonreal $\alpha$. ◁


Logarithmic factors. The basic principle underlying the method of proof of The- 
orem VI.1 (see also the summary Equation (12)) has the advantage of being easily 
extended to a wide class of singular functions, most notably the ones that involve 
logarithmic terms. 


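Theorem VI.2 (Standard function scale, logarithms). Let $\alpha$ be an arbitrary complex
number in $\mathbb{C} \setminus \mathbb{Z}_{\le 0}$. The coefficient of $z^n$ in
\[ f(z) = (1-z)^{-\alpha} \left( \frac{1}{z} \log \frac{1}{1-z} \right)^{\beta} \]
admits for large $n$ a full asymptotic expansion in descending powers of $\log n$,
\[ f_n = [z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} (\log n)^{\beta} \left[ 1 + \frac{C_1}{\log n} + \frac{C_2}{\log^2 n} + \cdots \right], \tag{21} \]
where $C_k = \binom{\beta}{k} \Gamma(\alpha) \left. \frac{d^k}{ds^k} \frac{1}{\Gamma(s)} \right|_{s=\alpha}$.

A coefficient of $1/z$ is introduced in front of the logarithm since $\log(1-z)^{-1} = z + O(z^2)$.
In this way, $f(z)$ is a bona fide power series in $z$, even in cases when $\beta$ is
not a positive integer.

PROOF. The proof is a simple variant of that of Theorem VI.1 (see [199] for details).
The basic expansion used is now
\[ f\left(1 + \frac{t}{n}\right) \left(1 + \frac{t}{n}\right)^{-n-1} \sim e^{-t} \left( \frac{-n}{t} \right)^{\alpha} \left( \log\left( \frac{-n}{t} \right) \right)^{\beta} \]
\[ \sim e^{-t} (-t)^{-\alpha} n^{\alpha} (\log n)^{\beta} \left( 1 - \frac{\log(-t)}{\log n} \right)^{\beta} \]
\[ \sim e^{-t} (-t)^{-\alpha} n^{\alpha} (\log n)^{\beta} \left( 1 - \beta \frac{\log(-t)}{\log n} + \frac{\beta(\beta-1)}{2!} \left( \frac{\log(-t)}{\log n} \right)^2 + \cdots \right). \]

Again, we are justified in using this expansion inside Cauchy's integral representation
of coefficients. What comes out from term by term integration is a collection of Hankel
integrals of the form
\[ -\frac{1}{2i\pi} \int_{+\infty}^{(0)} (-t)^{-s} e^{-t} (\log(-t))^k\, dt, \]
which reduce to derivatives of $1/\Gamma(s)$, as is seen by differentiation with respect to $s$
under the integral sign.


A typical example of application of Theorem VI.2 is the estimate
\[ [z^n] \frac{1}{\sqrt{1-z}} \frac{1}{\frac{1}{z} \log \frac{1}{1-z}} = \frac{1}{\sqrt{\pi n}\, \log n} \left( 1 - \frac{\gamma + 2\log 2}{\log n} + O\left( \frac{1}{\log^2 n} \right) \right). \]

(Such singular functions do occur in combinatorics and the analysis of algorithms [209].)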


▷ VI.5. Singularity analysis of slowly varying functions. A function $M(u)$ is said to be slowly
varying towards infinity (in the complex plane) if for any fixed $c > 0$ and all $\theta$ satisfying
$|\theta| < \pi - \phi$ for some $\phi \in (0, \frac{\pi}{2})$, there holds
\[ \lim_{u \to +\infty} \frac{M(c e^{i\theta} u)}{M(u)} = 1. \]

(Powers of logarithms and iterated logarithms are typically slowly varying functions.) Under
suitable uniformity assumptions, one has [199]
\[ [z^n] \frac{1}{(1-z)^{\alpha}} M\left( \frac{1}{1-z} \right) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} M(n). \tag{22} \]
For instance: $[z^n] \dfrac{\exp\left( \sqrt{\log \frac{1}{1-z}} \right)}{\sqrt{1-z}} = \dfrac{e^{\sqrt{\log n}}}{\sqrt{\pi n}} (1 + o(1))$. See also the discussion of Tauberian
theory, p. 416. ◁


▷ VI.6. Iterated logarithms. For a general $\alpha \notin \mathbb{Z}_{\le 0}$, the relation (22) specializes to
\[ [z^n] (1-z)^{-\alpha} \left( \frac1z \log \frac{1}{1-z} \right)^{\beta} \left( \frac1z \log\left( \frac1z \log \frac{1}{1-z} \right) \right)^{\delta} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} (\log n)^{\beta} (\log\log n)^{\delta}. \]

A full asymptotic expansion can be derived in this case. ◁

                              $\alpha \notin \{0, -1, -2, \ldots\}$    $\alpha \in \{0, -1, -2, \ldots\}$
$\beta \notin \mathbb{Z}_{\ge 0}$    Eq. (21)                                 Eq. (23)
$\beta \in \mathbb{Z}_{\ge 0}$       Eq. (24)                                 Eq. (26)

FIGURE VI.4. The general and special cases of $f_n \equiv [z^n] f(z)$ when $f(z)$ is as in
Theorem VI.2.


Special cases. The conditions of Theorems VI.1 and VI.2 exclude explicitly the
case when $\alpha$ is zero or a negative integer: the formulae actually remain valid in this case,
provided one interprets them as limit cases, making use of $0 = 1/\Gamma(0) = 1/\Gamma(-1) = \cdots$.
Also, when $\beta$ is a positive integer, the expansion of Theorem VI.2 terminates:
in that situation, stronger forms are valid. Such cases are summarized in Figure 4 and
discussed below.


The case of integral $\alpha \in \mathbb{Z}_{\le 0}$ and general $\beta \notin \mathbb{Z}_{\ge 0}$. When $\alpha$ is zero or a negative
integer, the coefficients of $f(z) = (1-z)^{-\alpha}$ eventually reduce to zero, so that the
asymptotic coefficient expansion becomes trivial: this situation is implicitly covered
by the statement of Theorem VI.1 since, in that case, $1/\Gamma(\alpha) = 0$. When logarithms
are present (with $\alpha \in \mathbb{Z}_{\le 0}$ still), the expansion of Theorem VI.2 regarding
\[ f(z) = (1-z)^{-\alpha} \left( \frac1z \log \frac{1}{1-z} \right)^{\beta} \]
remains valid provided we again take into account the equality $1/\Gamma(\alpha) = 0$ in formula (21)
after effecting simplifications by Gamma factors: it is only the first term
of (21) that vanishes, and one has
\[ [z^n] f(z) \sim n^{\alpha-1} (\log n)^{\beta} \left[ \frac{D_1}{\log n} + \frac{D_2}{\log^2 n} + \cdots \right], \tag{23} \]
where $D_k$ is given by $D_k = \binom{\beta}{k} \left. \frac{d^k}{ds^k} \frac{1}{\Gamma(s)} \right|_{s=\alpha}$. For instance, we find
\[ [z^n] \frac{z}{\log(1-z)^{-1}} = -\frac{1}{n \log^2 n} + \frac{2\gamma}{n \log^3 n} + O\left( \frac{1}{n \log^4 n} \right). \]


The case of general $\alpha \notin \mathbb{Z}_{\le 0}$ and integral $\beta \in \mathbb{Z}_{\ge 0}$. When $\beta = k$ is a nonnegative
integer, the error terms can be further improved with respect to the ones predicted by
the general statement of Theorem VI.2. For instance, we have:
\[ [z^n] \frac{1}{1-z} \log \frac{1}{1-z} = \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + O\left( \frac{1}{n^4} \right) \]
\[ [z^n] \frac{1}{\sqrt{1-z}} \log \frac{1}{1-z} \sim \frac{1}{\sqrt{\pi n}} \left( \log n + \gamma + 2\log 2 + O\left( \frac{\log n}{n} \right) \right). \]
(In such a case, the expansion of Theorem VI.2 terminates since only its first $(k+1)$
terms are nonzero.) In fact, in the general case of nonintegral $\alpha$, there exists an
expansion of the form
\[ [z^n] (1-z)^{-\alpha} \log^k \frac{1}{1-z} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left[ E_0(\log n) + \frac{E_1(\log n)}{n} + \cdots \right], \tag{24} \]
where the $E_j$ are polynomials of degree $k$, as can be proved by adapting the argument
employed for general $\alpha$ (see also Note 8).

The joint case of integral $\alpha \in \mathbb{Z}_{\le 0}$ and integral $\beta \in \mathbb{Z}_{\ge 0}$. If $\alpha$ is zero or a negative
integer, the coefficients appear as finite differences of coefficients of logarithmic powers.
Explicit formulae are then available elementarily from the calculus of finite differences
when $\beta$ is a positive integer. For instance, with $\alpha = -r$ for $r \in \mathbb{Z}_{\ge 0}$, one has, for $n > r$,
\[ [z^n] (1-z)^{r} \log \frac{1}{1-z} = (-1)^r \frac{r!}{n(n-1)\cdots(n-r)}. \tag{25} \]
The case $\alpha = -r$ and $\beta = k$ (with $r, k \in \mathbb{Z}_{\ge 0}$) is covered by (27) in Note 7 below:
there is a formula analogous to (24),
\[ [z^n] (1-z)^{r} \log^k \frac{1}{1-z} \sim n^{-r-1} \left[ F_0(\log n) + \frac{F_1(\log n)}{n} + \cdots \right], \tag{26} \]
but now with $\deg(F_j) = k - 1$.
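Formula (25) lends itself to an exact check by direct convolution of the polynomial $(1-z)^r$ with the series $\log(1-z)^{-1} = \sum_{m \ge 1} z^m/m$. The sketch below uses exact rational arithmetic; the function names are hypothetical and the check is only meaningful for $n > r$.

```python
from fractions import Fraction
from math import comb, factorial


def coeff_poly_times_log(n, r):
    """[z^n] (1-z)^r log(1/(1-z)) for n > r, using (1-z)^r = sum_j (-1)^j C(r,j) z^j."""
    return sum(
        Fraction((-1) ** j * comb(r, j), n - j)
        for j in range(0, min(r, n - 1) + 1)
    )


def closed_form(n, r):
    """Right-hand side of (25): (-1)^r r! / (n (n-1) ... (n-r))."""
    denom = 1
    for i in range(r + 1):
        denom *= n - i
    return Fraction((-1) ** r * factorial(r), denom)


if __name__ == "__main__":
    print(coeff_poly_times_log(7, 2), closed_form(7, 2))
```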

A table of the asymptotic form of coefficients of a few standard functions illus- 
trating Theorems VI.1 and VI.2 as well as some of the “special cases” is given in 
Figure 5. 


▷ VI.7. The method of Frobenius and Jungen. This is an alternative approach to the case
$\beta \in \mathbb{Z}_{\ge 0}$ (see [287]). Start from the observation that
\[ (1-z)^{-\alpha} \left( \log \frac{1}{1-z} \right)^{k} = \frac{\partial^k}{\partial \alpha^k} (1-z)^{-\alpha}, \]
and allow the operators of differentiation ($\partial/\partial\alpha$) and coefficient extraction ($[z^n]$) to commute
(this can be justified by Cauchy's coefficient formula upon differentiating under the integral
sign), which yields
\[ [z^n] (1-z)^{-\alpha} \left( \log \frac{1}{1-z} \right)^{k} = \frac{\partial^k}{\partial \alpha^k} \frac{\Gamma(n+\alpha)}{\Gamma(\alpha)\Gamma(n+1)}, \tag{27} \]
and leads to an "exact" formula (Note 8 below). ◁


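▷ VI.8. Shifted harmonic numbers. Define the $\alpha$-shifted harmonic number by
\[ h_n(\alpha) := \sum_{j=0}^{n-1} \frac{1}{j + \alpha}. \]

Set $L(z) := -\log(1-z)$. Then, one has
\[ [z^n] (1-z)^{-\alpha} L(z) = \binom{n+\alpha-1}{n} h_n(\alpha) \]
\[ [z^n] (1-z)^{-\alpha} L(z)^2 = \binom{n+\alpha-1}{n} \left( h'_n(\alpha) + h_n(\alpha)^2 \right). \]

(Note: $h_n(\alpha) = \psi(\alpha+n) - \psi(\alpha)$, where $\psi(s) := \partial_s \log \Gamma(s)$.) In particular,
\[ [z^n] \frac{1}{\sqrt{1-z}} \log \frac{1}{1-z} = \frac{1}{4^n} \binom{2n}{n} \left[ 2H_{2n} - H_n \right], \]
where $H_n \equiv h_n(1)$ is the usual harmonic number. ◁

Function                  Coefficients

$(1-z)^{3/2}$             $\frac{1}{\sqrt{\pi n^5}} \left( \frac34 + \frac{45}{32n} + \frac{1155}{512n^2} + O(\frac{1}{n^3}) \right)$
$(1-z)^{1/2}$             $-\frac{1}{\sqrt{\pi n^3}} \left( \frac12 + \frac{3}{16n} + \frac{25}{256n^2} + O(\frac{1}{n^3}) \right)$
$(1-z)^{1/2} L(z)$        $-\frac{1}{\sqrt{\pi n^3}} \left( \frac12 \log n + \frac{\gamma + 2\log 2 - 2}{2} + O(\frac{\log n}{n}) \right)$
$(1-z)^{1/3}$             $-\frac{1}{3\Gamma(\frac23) n^{4/3}} \left( 1 + \frac{2}{9n} + \frac{7}{81n^2} + O(\frac{1}{n^3}) \right)$
$z / L(z)$                $\frac{1}{n \log^2 n} \left( -1 + \frac{2\gamma}{\log n} + \frac{\pi^2 - 6\gamma^2}{2\log^2 n} + O(\frac{1}{\log^3 n}) \right)$
$(1-z)^{-1/3}$            $\frac{1}{\Gamma(\frac13) n^{2/3}} \left( 1 + O(\frac1n) \right)$
$(1-z)^{-1/2}$            $\frac{1}{\sqrt{\pi n}} \left( 1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3} + O(\frac{1}{n^4}) \right)$
$(1-z)^{-1/2} L(z)$       $\frac{1}{\sqrt{\pi n}} \left( \log n + \gamma + 2\log 2 - \frac{\log n + \gamma + 2\log 2}{8n} + O(\frac{\log n}{n^2}) \right)$
$(1-z)^{-1} L(z)$         $\log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} + O(\frac{1}{n^6})$
$(1-z)^{-1} L(z)^2$       $\log^2 n + 2\gamma \log n + \gamma^2 - \frac{\pi^2}{6} + O(\frac{\log n}{n})$
$(1-z)^{-3/2}$            $\sqrt{\frac{n}{\pi}} \left( 2 + \frac{3}{4n} - \frac{7}{64n^2} + O(\frac{1}{n^3}) \right)$
$(1-z)^{-3/2} L(z)$       $\sqrt{\frac{n}{\pi}} \left( 2\log n + 2\gamma + 4\log 2 - 4 + \frac{3\log n}{4n} + O(\frac1n) \right)$
$(1-z)^{-2} L(z)$         $n \log n + (\gamma - 1) n + \log n + \frac12 + \gamma + O(\frac1n)$
$(1-z)^{-2} L(z)^2$       $n \left( \log^2 n + 2(\gamma - 1)\log n + \gamma^2 - 2\gamma + 2 - \frac{\pi^2}{6} + O(\frac{\log n}{n}) \right)$
$(1-z)^{-3}$              $\frac12 n^2 + \frac32 n + 1$

FIGURE VI.5. A table of some commonly encountered functions (with $L(z) := \log(1/(1-z))$)
and the asymptotic forms of their coefficients.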
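The special case $\alpha = \frac12$ of the shifted harmonic number identity can be confirmed exactly for small $n$ by multiplying the two power series term by term. The following sketch (helper names are illustrative, not from the text) relies on $[z^k](1-z)^{-1/2} = 4^{-k}\binom{2k}{k}$ and $[z^k] L(z) = 1/k$.

```python
from fractions import Fraction
from math import comb


def inv_sqrt_coeffs(N):
    """Coefficients [z^k](1-z)^(-1/2) = binomial(2k, k) / 4^k for k = 0..N."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(N + 1)]


def log_coeffs(N):
    """Coefficients [z^k] log(1/(1-z)): 0 for k = 0, and 1/k for k >= 1."""
    return [Fraction(0)] + [Fraction(1, k) for k in range(1, N + 1)]


def harmonic(n):
    """Usual harmonic number H_n as an exact fraction."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def by_convolution(n):
    """[z^n] (1-z)^(-1/2) log(1/(1-z)) computed as a Cauchy product."""
    a, b = inv_sqrt_coeffs(n), log_coeffs(n)
    return sum(a[k] * b[n - k] for k in range(n + 1))


def by_closed_form(n):
    """4^(-n) binomial(2n, n) (2 H_{2n} - H_n)."""
    return Fraction(comb(2 * n, n), 4 ** n) * (2 * harmonic(2 * n) - harmonic(n))


if __name__ == "__main__":
    print(by_convolution(6), by_closed_form(6))
```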


VI.3. Transfers 


Our general objective is to translate an approximation of a function near a
singularity into an asymptotic approximation of its coefficients. What is required at this
stage is a way to extract coefficients of error terms (known usually in $O(\cdot)$ or $o(\cdot)$
form) in the expansion of a function near a singularity. This task is technically simple
as a fairly coarse analysis suffices. As in the previous section, it relies on contour
integration by means of Hankel-type paths; see for instance the summary in Eq. (12)
above.

A natural extension of the approach of the previous section is to assume the error
terms valid in the complex plane slit along the real half line $\mathbb{R}_{\ge 1}$. In fact weaker
conditions suffice and any domain whose boundary makes an acute angle with the
half line $\mathbb{R}_{\ge 1}$ appears to be suitable.

FIGURE VI.6. A $\Delta$-domain and the contour used to establish Theorem VI.3.


Definition VI.1. Given two numbers $\phi, R$ with $R > 1$ and $0 < \phi < \frac{\pi}{2}$, the open
domain $\Delta(\phi, R)$ is defined as
\[ \Delta(\phi, R) = \{ z \mid |z| < R,\ z \ne 1,\ |\arg(z-1)| > \phi \}. \]
A domain is a $\Delta$-domain if it is a $\Delta(\phi, R)$ for some $R$ and $\phi$. A function is $\Delta$-analytic
if it is analytic in some $\Delta$-domain.
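The definition translates directly into a membership test. The sketch below is only an illustration (the function name `in_delta_domain` is hypothetical); it uses the principal argument returned by `cmath.phase`, which lies in $(-\pi, \pi]$.

```python
import cmath
import math


def in_delta_domain(z, phi, R):
    """Return True when z lies in Delta(phi, R) = {|z| < R, z != 1, |arg(z-1)| > phi}."""
    if not (R > 1 and 0 < phi < math.pi / 2):
        raise ValueError("need R > 1 and 0 < phi < pi/2")
    z = complex(z)
    if z == 1:
        return False
    return abs(z) < R and abs(cmath.phase(z - 1)) > phi


if __name__ == "__main__":
    # The origin is inside; points on the real axis to the right of 1 are excluded.
    print(in_delta_domain(0, math.pi / 4, 2), in_delta_domain(1.5, math.pi / 4, 2))
```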


Analyticity in a $\Delta$-domain (Figure 6, left) is the basic condition for transfer to
coefficients of error terms in asymptotic expansions.


Theorem VI.3 (Transfer, Big-Oh and little-oh). Let $\alpha, \beta$ be arbitrary real numbers,
$\alpha, \beta \in \mathbb{R}$, and let $f(z)$ be a function that is $\Delta$-analytic.

(i) Assume that $f(z)$ satisfies in the intersection of a neighbourhood of 1 with its
$\Delta$-domain the condition
\[ f(z) = O\left( (1-z)^{-\alpha} \left( \log \frac{1}{1-z} \right)^{\beta} \right). \]
Then one has: $[z^n] f(z) = O(n^{\alpha-1} (\log n)^{\beta})$.

(ii) Assume that $f(z)$ satisfies in the intersection of a neighbourhood of 1 with its
$\Delta$-domain the condition
\[ f(z) = o\left( (1-z)^{-\alpha} \left( \log \frac{1}{1-z} \right)^{\beta} \right). \]
Then one has: $[z^n] f(z) = o(n^{\alpha-1} (\log n)^{\beta})$.

PROOF. (i) The starting point is Cauchy's coefficient formula,
\[ f_n \equiv [z^n] f(z) = \frac{1}{2i\pi} \int_{\gamma} f(z) \frac{dz}{z^{n+1}}, \]
where $\gamma$ is any simple loop around the origin which is internal to the $\Delta$-domain of $f$.
We choose the positively oriented contour (Figure 6, right) $\gamma = \gamma_1 \cup \gamma_2 \cup \gamma_3 \cup \gamma_4$,
with
\[ \begin{cases}
\gamma_1 = \{ z \mid |z-1| = \frac1n,\ |\arg(z-1)| \ge \theta \} & \text{(inner circle)} \\
\gamma_2 = \{ z \mid \frac1n \le |z-1|,\ |z| \le r,\ \arg(z-1) = \theta \} & \text{(rectilinear part, top)} \\
\gamma_3 = \{ z \mid |z| = r,\ |\arg(z-1)| \ge \theta \} & \text{(outer circle)} \\
\gamma_4 = \{ z \mid \frac1n \le |z-1|,\ |z| \le r,\ \arg(z-1) = -\theta \} & \text{(rectilinear part, bottom)}.
\end{cases} \]

If the $\Delta$-domain of $f$ is $\Delta(\phi, R)$, we assume that $1 < r < R$, and $\phi < \theta < \frac{\pi}{2}$, so that
the contour $\gamma$ lies entirely inside the domain of analyticity of $f$.
For $j = 1, 2, 3, 4$, let
\[ f_n^{(j)} = \frac{1}{2i\pi} \int_{\gamma_j} f(z) \frac{dz}{z^{n+1}}. \]

The analysis proceeds by bounding the absolute value of the integral along each of
the four parts. In order to keep notations simple, we detail the proof in the case where
$\beta = 0$.

(1) Inner circle ($\gamma_1$). From trivial bounds, the contribution from $\gamma_1$ satisfies
\[ |f_n^{(1)}| = O\left( \frac1n \right) \cdot O\left( \left( \frac1n \right)^{-\alpha} \right) = O(n^{\alpha-1}), \]
as the function is $O(n^{\alpha})$ (by assumption on $f(z)$), the contour has length
$O(n^{-1})$, and $z^{-n-1}$ remains $O(1)$ on this part of the contour.

(2) Rectilinear parts ($\gamma_2, \gamma_4$). Consider the contribution $f_n^{(2)}$ arising from the
part $\gamma_2$ of the contour. Setting $\omega = e^{i\theta}$, and performing the change of
variable $z = 1 + \frac{\omega t}{n}$, we find
\[ |f_n^{(2)}| < \frac{1}{2\pi} \int_1^{\infty} K \left( \frac{t}{n} \right)^{-\alpha} \left| 1 + \frac{\omega t}{n} \right|^{-n-1} \frac{dt}{n}, \]
for some constant $K > 0$ such that $|f(z)| < K |1-z|^{-\alpha}$ over the $\Delta$-domain,
which is granted by the growth assumption on $f$. From the relation
\[ \left| 1 + \frac{\omega t}{n} \right| \ge 1 + \Re\left( \frac{\omega t}{n} \right) = 1 + \frac{t}{n} \cos\theta, \]
there results the inequality
\[ |f_n^{(2)}| < \frac{K}{2\pi} J_n n^{\alpha-1}, \quad \text{where} \quad J_n = \int_1^{\infty} t^{-\alpha} \left( 1 + \frac{t\cos\theta}{n} \right)^{-n} dt. \]
For a given $\alpha$, the integrals $J_n$ are all bounded above by some constant since
they admit a limit as $n$ tends to infinity:
\[ J_n \to \int_1^{\infty} t^{-\alpha} e^{-t\cos\theta}\, dt. \]
The condition on $\theta$ that $0 < \theta < \frac{\pi}{2}$ precisely ensures convergence of the
integral. Thus, globally, on the part $\gamma_2$ of the contour, we have
\[ |f_n^{(2)}| = O(n^{\alpha-1}). \]
A similar bound holds for $f_n^{(4)}$ relative to $\gamma_4$.

(3) Outer circle ($\gamma_3$). There, $f(z)$ is bounded while $z^{-n}$ is of the order of $r^{-n}$.
Thus, the integral $f_n^{(3)}$ is exponentially small.

In summary, each of the four integrals of the split contour contributes $O(n^{\alpha-1})$. The
statement of Part (i) of the theorem thus follows.


(ii) An adaptation of the proof shows that $o(\cdot)$ error terms may be translated
similarly. All that is required is a further breakup of the rectilinear part at a distance
$\log^2 n / n$ from 1 (see Equation (20) or [199] for details).


An immediate corollary of Theorem VI.3 is the possibility of transferring asymptotic
equivalence from singular forms to coefficients:


Corollary VI.1 (sim-transfer). Assume that $f(z)$ is $\Delta$-analytic and
\[ f(z) \sim (1-z)^{-\alpha}, \quad \text{as } z \to 1,\ z \in \Delta, \]
with $\alpha \notin \{0, -1, -2, \ldots\}$. Then, the coefficients of $f$ satisfy
\[ [z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}. \]

PROOF. It suffices to observe that, with $g(z) = (1-z)^{-\alpha}$, one has
\[ f(z) \sim g(z) \quad \text{iff} \quad f(z) = g(z) + o(g(z)), \]
then apply Theorem VI.1 to the first term, and Theorem VI.3 (little-oh transfer) to the
remainder.


▷ VI.9. Transfer of nearly polynomial functions. Let $f(z)$ be $\Delta$-singular and satisfy the singular
expansion $f(z) \sim (1-z)^r$, where $r \in \mathbb{Z}_{\ge 0}$. Then, $f_n = o(n^{-r-1})$. [This is also a direct
consequence of the little-oh transfer.] ◁


▷ VI.10. Transfer of large negative exponents. The $\Delta$-analyticity condition can be weakened
for functions that are large at their singularity. Assume that $f(z)$ is analytic in the open disk
$|z| < 1$, and that in the whole of the open disk it satisfies
\[ f(z) = O((1-z)^{-\alpha}). \]
Then, provided $\alpha > 1$, one has
\[ [z^n] f(z) = O(n^{\alpha-1}). \]
[Hint. Integrate on the circle of radius $1 - \frac1n$; see also [199].] ◁

VI.4. The process of singularity analysis


In Sections VI.2 and VI.3, we have developed a collection of statements granting
the existence of correspondences between properties of a function $f(z)$ singular at an
isolated point ($z = 1$) and the asymptotic behaviour of its coefficients $f_n = [z^n] f(z)$.
Using the symbol '⟶' to represent such a correspondence², we can summarize some
of our results relative to the scale $\{ (1-z)^{-\alpha},\ \alpha \in \mathbb{C} \setminus \mathbb{Z}_{\le 0} \}$ as follows:
\[ \begin{array}{llll}
f(z) = (1-z)^{-\alpha} & \longrightarrow & f_n = \dfrac{n^{\alpha-1}}{\Gamma(\alpha)} + \cdots & \text{(Theorem VI.1)} \\
f(z) = O((1-z)^{-\alpha}) & \longrightarrow & f_n = O(n^{\alpha-1}) & \text{(Theorem VI.3 (i))} \\
f(z) = o((1-z)^{-\alpha}) & \longrightarrow & f_n = o(n^{\alpha-1}) & \text{(Theorem VI.3 (ii))} \\
f(z) \sim (1-z)^{-\alpha} & \longrightarrow & f_n \sim \dfrac{n^{\alpha-1}}{\Gamma(\alpha)} & \text{(Cor. VI.1)}.
\end{array} \]
The important requirement is that the function should have an isolated singularity (the
condition of $\Delta$-analyticity) and that the asymptotic property of the function near its
singularity should be valid in an area of the complex plane extending beyond the disk
of convergence of the original series (in a $\Delta$-domain). Extensions to logarithmic
powers and special cases like $\alpha \in \mathbb{Z}_{\le 0}$ are also, as we know, available. We let $\mathcal{S}$
denote the set of such singular functions:
\[ \mathcal{S} = \{ (1-z)^{-\alpha} \lambda(z)^{\beta} \mid \alpha, \beta \in \mathbb{C} \}, \qquad \lambda(z) := \frac{1}{z} \log \frac{1}{1-z}. \tag{28} \]


At this stage, we thus have available tools by which, starting from the expansion 
of a function at its singularity, also called singular expansion, one can justify the term- 
by-term transfer from an approximation of the function to an asymptotic estimate of 
the coefficients. We state: 


Theorem VI.4 (Singularity analysis, single singularity). Let $f(z)$ be a function analytic
at 0 with a singularity at $\zeta$, such that $f(z)$ can be continued to a domain of the form
$\zeta \cdot \Delta_0$, for a $\Delta$-domain $\Delta_0$, where $\zeta \cdot \Delta_0$ is the image of $\Delta_0$ by the mapping $z \mapsto \zeta z$.
Assume that there exist two functions $\sigma, \tau$, where $\sigma$ is a (finite) linear combination of
functions in $\mathcal{S}$ and $\tau \in \mathcal{S}$, so that
\[ f(z) = \sigma(z/\zeta) + O(\tau(z/\zeta)) \quad \text{as } z \to \zeta \text{ in } \zeta \cdot \Delta_0. \]
Then, the coefficients of $f(z)$ satisfy the asymptotic estimate
\[ f_n = \zeta^{-n} \sigma_n + O(\zeta^{-n} \tau^{\star}_n), \]
where $\sigma_n = [z^n] \sigma(z)$ has its coefficients determined by Theorems VI.1, VI.2 and
$\tau^{\star}_n = n^{a-1} (\log n)^{b}$, if $\tau(z) = (1-z)^{-a} \lambda(z)^{b}$.

We observe that the statement is equivalent to $\tau^{\star}_n = [z^n] \tau(z)$, except when $a \in \mathbb{Z}_{\le 0}$ (when
the $1/\Gamma(a)$ factor should be omitted). Also, generically, we have $\tau^{\star}_n = o(\sigma_n)$, so that orders of
growth of functions at singularities are mapped to orders of growth of coefficients.


² The symbol '⟹' represents an unconditional logical implication and is accordingly used in this
book to represent the systematic correspondence between combinatorial specifications and generating
function equations. In contrast, the symbol '⟶' represents a mapping from functions to coefficients, under
suitable analytic conditions as stated in Theorems VI.1–VI.3.

Let $f(z)$ be a function analytic at 0 whose coefficients are to be asymptotically analysed.
1. Preparation. This consists in locating dominant singularities and checking analytic
continuation.
1a. Locate singularities. Determine the dominant singularities of $f(z)$ (assumed not to
be entire). Check that $f(z)$ has a single singularity $\zeta$ on its circle of convergence.
1b. Check continuation. Establish that $f(z)$ is analytic in some domain of the form $\zeta\Delta_0$.

2. Singular expansion. Analyse the function $f(z)$ as $z \to \zeta$ in the domain $\zeta\Delta_0$ and determine
in that domain an expansion of the form
\[ f(z) \underset{z \to \zeta}{=} \sigma(z/\zeta) + O(\tau(z/\zeta)) \quad \text{with } \tau(z) \ll \sigma(z). \]
For the method to succeed, the functions $\sigma$ and $\tau$ should belong to the standard scale of
functions $\mathcal{S} = \{ (1-z)^{-\alpha} \lambda(z)^{\beta} \}$, with $\lambda(z) = z^{-1} \log(1-z)^{-1}$.
3. Transfer. Translate the main term $\sigma(z)$ using the catalogues provided by Theorems VI.1
and VI.2. Transfer the error term (Theorem VI.3) and conclude that
\[ [z^n] f(z) \underset{n \to \infty}{=} \zeta^{-n} \sigma_n + O(\zeta^{-n} \tau^{\star}_n), \]
where $\sigma_n = [z^n] \sigma(z)$ and $\tau^{\star}_n = [z^n] \tau(z)$ provided the corresponding exponent $a \notin \mathbb{Z}_{\le 0}$
(otherwise, the factor $1/\Gamma(a) = 0$ should be dropped).


FIGURE VI.7. A summary of the singularity analysis process (single dominant singularity).


PROOF. The normalized function $g(z) = f(\zeta z)$ is singular at 1. It is $\Delta$-analytic and
satisfies the relation $g(z) = \sigma(z) + O(\tau(z))$ as $z \to 1$ within $\Delta_0$. Theorem VI.3, (i)
(the big-Oh transfer) applies to the $O$-error term. The statement follows finally since
$[z^n] f(z) = \zeta^{-n} [z^n] g(z)$.

The statement of Theorem VI.4 can be concisely expressed by the correspondence:
\[ f(z) \underset{z \to \zeta}{=} \sigma(z/\zeta) + O(\tau(z/\zeta)) \quad \longrightarrow \quad f_n \underset{n \to \infty}{=} \zeta^{-n} \sigma_n + O(\zeta^{-n} \tau^{\star}_n). \tag{29} \]
The conditions of analytic continuation and validity of the expansion in a $\Delta$-domain
are essential. Similarly, we have
\[ f(z) \underset{z \to \zeta}{=} \sigma(z/\zeta) + o(\tau(z/\zeta)) \quad \longrightarrow \quad f_n \underset{n \to \infty}{=} \zeta^{-n} \sigma_n + o(\zeta^{-n} \tau^{\star}_n), \tag{30} \]
as a simple consequence of Theorem VI.3, part (ii) (little-oh transfer). The mappings
(29) and (30) supplemented by the accompanying analysis constitute the heart
of the singularity analysis process summarized in Figure 7.

Many of the functions commonly encountered in analysis are found to be
$\Delta$-continuable. This fact results from the property of the elementary functions (like √,
log, tan) to be continuable to larger regions than what their expansions imply, as well
as from the rich set of composition properties that analytic functions satisfy. Also,
asymptotic expansions at a singularity initially determined along the real axis by elementary
real analysis often hold in much wider regions of the complex plane. The singularity
analysis process is then likely to be applicable to a large number of generating functions
that are provided by the symbolic method, most notably the iterative structures
described in Section IV.4 (p. 236). In such cases, singularity analysis greatly refines
the exponential growth estimates obtained in Theorem IV.8 (p. 237). The condition is
that singular expansions should be of a suitably moderate³ growth. We illustrate this
situation now by treating combinatorial generating functions obtained by the symbolic
methods of Chapters I and II, for which explicit expressions are available.


EXAMPLE VI.2. Asymptotics of 2-regular graphs. This example completes the discussion of
Example 1, p. 363. The labelled class $\mathcal{C}$ of 2-regular graphs satisfies
\[ \mathcal{C} = \mathrm{SET}(\mathrm{UCYC}_{\ge 3}(\mathcal{Z})) \quad \Longrightarrow \quad C(z) = \exp\left( \frac12 \log \frac{1}{1-z} - \frac{z}{2} - \frac{z^2}{4} \right), \]
where UCYC is the undirected cycle construction (Note II.21, p. 124). For this example, we
follow step by step the singularity analysis process as summarized in Figure 7.

1. Preparation. The function $C(z)$ being the product of $e^{-z/2 - z^2/4}$ (that is entire) and
of $(1-z)^{-1/2}$ (that is analytic in the unit disk) is itself analytic in the unit disk. Also, since
$(1-z)^{-1/2}$ is $\Delta$-analytic (it is well-defined and analytic in the complex plane slit along $\mathbb{R}_{\ge 1}$),
$C(z)$ is itself $\Delta$-analytic, with a singularity at $z = 1$.

2. Singular expansion. The asymptotic expansion of $C(z)$ near $z = 1$ is obtained starting
from the standard (analytic) expansion of $e^{-z/2 - z^2/4}$ at $z = 1$,
\[ e^{-z/2 - z^2/4} = e^{-3/4} + e^{-3/4} (1-z) + \frac{e^{-3/4}}{4} (1-z)^2 - \frac{e^{-3/4}}{12} (1-z)^3 + \cdots. \]
The factor $(1-z)^{-1/2}$ is its own asymptotic expansion, clearly valid in any $\Delta$-domain.
Performing the multiplication yields a complete expansion,
\[ C(z) \sim \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4} \sqrt{1-z} + \frac{e^{-3/4}}{4} (1-z)^{3/2} - \frac{e^{-3/4}}{12} (1-z)^{5/2} + \cdots, \tag{31} \]
out of which terminating forms can be extracted.

3. Transfer. Take for instance the expansion of (31) limited to two terms plus an error
term. The singularity analysis process allows the transfer of (31) to coefficients, which we can
present in tabular form as follows:
\[ \begin{array}{l|l}
C(z) & c_n \equiv [z^n] C(z) \\ \hline
\dfrac{e^{-3/4}}{\sqrt{1-z}} & e^{-3/4} \dbinom{n - 1/2}{-1/2} \sim \dfrac{e^{-3/4}}{\sqrt{\pi n}} \left[ 1 - \dfrac{1}{8n} + \dfrac{1}{128n^2} + \cdots \right] \\
+\, e^{-3/4} \sqrt{1-z} & +\, e^{-3/4} \dbinom{n - 3/2}{-3/2} \sim -\dfrac{e^{-3/4}}{2\sqrt{\pi n^3}} \left[ 1 + \dfrac{3}{8n} + \cdots \right] \\
+\, O((1-z)^{3/2}) & +\, O\left( \dfrac{1}{n^{5/2}} \right).
\end{array} \]
Terms are then collected with expansions suitably truncated to the coarsest error term, so that
here a 3-term expansion results. In the sequel, we shall no longer need to detail such
computations and we shall content ourselves with putting in parallel the function's expansion and the
coefficient's expansion as in the following correspondence:
\[ C(z) = \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4} \sqrt{1-z} + O((1-z)^{3/2}) \quad \longrightarrow \quad c_n = \frac{e^{-3/4}}{\sqrt{\pi n}} - \frac{5 e^{-3/4}}{8\sqrt{\pi n^3}} + O\left( \frac{1}{n^{5/2}} \right). \]

³ For functions with fast growth at a singularity, the saddle-point method developed in Chapter VIII
becomes effectual.

Here is a numerical check. Set $c_n^{(1)} := e^{-3/4}/\sqrt{\pi n}$ and let $c_n^{(2)}$ represent the sum of the first
two terms of the expansion of $c_n$. One finds:

n                   5           50                         500
$n!\, c_n^{(1)}$    14.30212    $1.1462888618 \cdot 10^{63}$    $1.4542120372 \cdot 10^{1132}$
$n!\, c_n^{(2)}$    12.51435    $1.1319602511 \cdot 10^{63}$    $1.4523942721 \cdot 10^{1132}$
$n!\, c_n$          12          $1.1319677968 \cdot 10^{63}$    $1.4523943224 \cdot 10^{1132}$

Clearly, a complete asymptotic expansion in descending powers of $n$ can be obtained in this
way. .......... END OF EXAMPLE VI.2.
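The numerical check can be repeated with exact arithmetic. The sketch below computes $c_n = [z^n]\, e^{-z/2 - z^2/4} (1-z)^{-1/2}$ as a Cauchy product and compares it with the two-term estimate $\frac{e^{-3/4}}{\sqrt{\pi n}} \left( 1 - \frac{5}{8n} \right)$. The recurrence used for the exponential factor follows from $E'(z) = -\frac{1+z}{2} E(z)$; function names are illustrative only.

```python
from fractions import Fraction
from math import exp, factorial, pi, sqrt


def two_regular_egf_coeffs(N):
    """Coefficients c_0..c_N of C(z) = exp(-z/2 - z^2/4) / sqrt(1 - z), as exact fractions."""
    # a_k = [z^k](1-z)^(-1/2), via a_k = a_{k-1} (2k-1)/(2k)
    a = [Fraction(1)]
    for k in range(1, N + 1):
        a.append(a[-1] * Fraction(2 * k - 1, 2 * k))
    # e_k = [z^k] exp(-z/2 - z^2/4), via (k+1) e_{k+1} = -(e_k + e_{k-1}) / 2
    e = [Fraction(1)]
    for k in range(N):
        prev = e[k - 1] if k >= 1 else Fraction(0)
        e.append(-(e[k] + prev) / (2 * (k + 1)))
    return [sum(e[j] * a[n - j] for j in range(n + 1)) for n in range(N + 1)]


def two_term_estimate(n):
    """e^(-3/4)/sqrt(pi n) * (1 - 5/(8n)), the first two terms of the expansion of c_n."""
    return exp(-0.75) / sqrt(pi * n) * (1 - 5 / (8 * n))


if __name__ == "__main__":
    c = two_regular_egf_coeffs(50)
    print(c[5] * factorial(5), float(c[50]) / two_term_estimate(50))
```

For $n = 5$ the exact count is 12 (the $4!/2$ undirected 5-cycles), and for $n = 50$ the ratio of exact value to estimate is already very close to 1.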


EXAMPLE VI.3. Asymptotics of unary–binary trees and Motzkin numbers. Unary–binary trees
are unlabelled plane trees that admit the specification and OGF:
\[ \mathcal{U} = \mathcal{Z} (1 + \mathcal{U} + \mathcal{U} \times \mathcal{U}) \quad \Longrightarrow \quad U(z) = \frac{1 - z - \sqrt{(1+z)(1-3z)}}{2z}. \]
(See Note I.36 (p. 63) and Subsection V.3 (p. 295) for the lattice path version.) The GF $U(z)$
is singular at $z = -1$ and $z = \frac13$, the dominant singularity being at $z = \frac13$. By branching
properties of the square-root function, $U(z)$ is analytic in a $\Delta$-domain like the one depicted
below:


Around the point $\frac13$, a singular expansion is obtained by multiplying $(1-3z)^{1/2}$ and the
analytic expansion of the factor $(1+z)^{1/2}/(2z)$. The singularity analysis process then applies
and yields automatically:
\[ U(z) = 1 - 3^{1/2} \sqrt{1-3z} + O((1-3z)) \quad \longrightarrow \quad U_n = \sqrt{\frac{3}{4\pi n^3}}\, 3^n + O(3^n n^{-2}). \]
Further terms in the singular expansion of $U(z)$ at $z = \frac13$ provide additional terms in the
asymptotic expression of the Motzkin numbers $U_n$, for instance,
\[ U_n = \sqrt{\frac{3}{4\pi n^3}}\, 3^n \left( 1 - \frac{15}{16n} + \frac{505}{512n^2} - \frac{8085}{8192n^3} + \frac{505659}{524288n^4} + O\left( \frac{1}{n^5} \right) \right) \]
results from an expansion of $U(z)$ till $O((1-3z)^{11/2})$. .......... END OF EXAMPLE VI.3.
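The estimate for $U_n$ can be compared with exact values obtained from the functional equation $U(z) = z(1 + U(z) + U(z)^2)$, which gives $U_n = [n = 1] + U_{n-1} + \sum_{i+j=n-1} U_i U_j$. The following sketch uses the leading term together with the first correction $-\frac{15}{16n}$; the function names are hypothetical.

```python
from math import pi, sqrt


def unary_binary_counts(N):
    """Exact counts U_0..U_N of unary-binary trees, from U = z (1 + U + U^2)."""
    U = [0] * (N + 1)
    for n in range(1, N + 1):
        conv = sum(U[i] * U[n - 1 - i] for i in range(n))
        U[n] = (1 if n == 1 else 0) + U[n - 1] + conv
    return U


def motzkin_estimate(n):
    """sqrt(3 / (4 pi n^3)) * 3^n * (1 - 15/(16 n))."""
    return sqrt(3 / (4 * pi * n ** 3)) * 3.0 ** n * (1 - 15 / (16 * n))


if __name__ == "__main__":
    U = unary_binary_counts(200)
    print(U[1:7], U[200] / motzkin_estimate(200))
```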


EXAMPLE VI.4. Asymptotics of children's rounds. Stanley [445] has introduced certain
combinatorial configurations that he has nicknamed "children's rounds": a round is a labelled
set of directed cycles, each of which has a center attached. The specification and EGF are
\[ \mathcal{R} = \mathrm{SET}(\mathcal{Z} \star \mathrm{CYC}(\mathcal{Z})) \quad \Longrightarrow \quad R(z) = \exp\left( z \log \frac{1}{1-z} \right) = (1-z)^{-z}. \]
The function $R(z)$ is analytic in the $\mathbb{C}$-plane slit along $\mathbb{R}_{\ge 1}$, as is seen by elementary properties
of the composition of analytic functions. The singular expansion at $z = 1$ is then mapped to an
expansion for the coefficients:
\[ R(z) = \frac{1}{1-z} + \log(1-z) + O((1-z)^{1/2}) \quad \longrightarrow \quad [z^n] R(z) = 1 - \frac{1}{n} + O(n^{-3/2}). \]

A more detailed analysis yields
\[ [z^n] R(z) = 1 - \frac{1}{n} - \frac{1}{n^2} (\log n + \gamma - 1) + O\left( \frac{\log^2 n}{n^3} \right), \]
and an expansion to any order can be easily obtained. .......... END OF EXAMPLE VI.4.
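A numerical experiment helps to see how slowly the logarithmic corrections set in. The sketch below computes $r_n = [z^n] \exp(z \log\frac{1}{1-z})$ from the standard recurrence $n r_n = \sum_{k=1}^{n} k g_k r_{n-k}$ for the exponential of a series $g(z)$, here with $g_k = \frac{1}{k-1}$ for $k \ge 2$. The refined estimate used for comparison, $1 - \frac1n - \frac{1}{n^2}(\log n + \gamma - 1)$, is the one obtained by transferring the term $\frac12 (1-z) \log^2(1-z)$ of the singular expansion; all names in the code are illustrative.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def rounds_coeffs(N):
    """Coefficients r_0..r_N of R(z) = exp(z log(1/(1-z))), in floating point."""
    # g_k = [z^k] z log(1/(1-z)) = 1/(k-1) for k >= 2, and g_0 = g_1 = 0
    g = [0.0, 0.0] + [1.0 / (k - 1) for k in range(2, N + 1)]
    r = [1.0] + [0.0] * N
    for n in range(1, N + 1):
        r[n] = sum(k * g[k] * r[n - k] for k in range(1, n + 1)) / n
    return r


def refined_estimate(n):
    """1 - 1/n - (log n + gamma - 1)/n^2."""
    return 1 - 1 / n - (log(n) + EULER_GAMMA - 1) / n ** 2


if __name__ == "__main__":
    r = rounds_coeffs(100)
    print(r[100], 1 - 1 / 100, refined_estimate(100))
```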


▷ VI.11. The asymptotic shape of the rounds numbers. A complete asymptotic expansion has
the form
\[ [z^n] R(z) \sim 1 - \sum_{j \ge 1} \frac{P_j(\log n)}{n^j}, \]
where $P_j$ is a polynomial of degree $j - 1$. (The coefficients of $P_j$ are rational combinations of
powers of $\gamma, \zeta(2), \ldots, \zeta(j-1)$.) The successive terms in this expansion are easily obtained by
a computer algebra program. ◁


EXAMPLE VI.5. Asymptotics of coefficients of an elementary function. Our final example
is meant to show the way rather arbitrary compositions of basic functions can be treated by
singularity analysis. Let $\mathcal{C} = \mathcal{Z} \star \mathrm{SEQ}(\mathcal{C})$ be the class of general labelled plane trees. Consider
the labelled class defined by substitution
\[ \mathcal{F} = \mathcal{C} \circ \mathrm{CYC}(\mathrm{CYC}(\mathcal{Z})) \quad \Longrightarrow \quad F(z) = C(L(L(z))). \]
There, $C(z) = \frac12 (1 - \sqrt{1-4z})$ and $L(z) = \log \frac{1}{1-z}$. Combinatorially, $\mathcal{F}$ is the class of trees
in which nodes are replaced by cycles of cycles, a rather artificial combinatorial object, and
\[ F(z) = \frac12 \left[ 1 - \sqrt{1 - 4 \log \frac{1}{1 - \log \frac{1}{1-z}}} \right]. \]
The problem is first to locate the dominant singularity of $F(z)$, then to determine its nature,
which can be done inductively on the structure of $F(z)$. The dominant positive singularity $\rho$ of
$F(z)$ satisfies $L(L(\rho)) = \frac14$ and one has
\[ \rho = 1 - e^{e^{-1/4} - 1} \approx 0.198443, \]
given that $C(z)$ is singular at $\frac14$ and $L(z)$ has positive coefficients. Since $L(L(z))$ is analytic at
$\rho$, a local expansion of $F(z)$ is obtained next by composition of the singular expansion of $C(z)$
at $\frac14$ with the standard Taylor expansion of $L(L(z))$ at $\rho$. We find
\[ F(z) = \frac12 - C_1 (\rho - z)^{1/2} + O((\rho - z)^{3/2}) \quad \longrightarrow \quad [z^n] F(z) = \frac{C_1}{2\sqrt{\pi n^3}}\, \rho^{-n+1/2} \left[ 1 + O\left( \frac1n \right) \right], \]
with $C_1 = e^{5/8 - \frac12 e^{-1/4}} \approx 1.2657$. .......... END OF EXAMPLE VI.5.
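The location of the singularity is easy to confirm numerically. In the sketch below, `rho` is computed from the closed form and checked against the defining equation $L(L(\rho)) = \frac14$. The constant `C1` is computed under the assumption that the coefficient of $(\rho - z)^{1/2}$ equals $\sqrt{(L \circ L)'(\rho)} = e^{5/8 - \frac12 e^{-1/4}}$, and is compared with the ratio $(\frac12 - F(z))/\sqrt{\rho - z}$ slightly to the left of $\rho$; variable names are illustrative.

```python
from math import exp, log, sqrt


def L(z):
    """L(z) = log(1/(1-z))."""
    return log(1 / (1 - z))


def F(z):
    """F(z) = C(L(L(z))) with C(z) = (1 - sqrt(1 - 4z)) / 2."""
    return (1 - sqrt(1 - 4 * L(L(z)))) / 2


# Dominant singularity: L(L(rho)) = 1/4
rho = 1 - exp(exp(-0.25) - 1)

# Assumed value of the square-root coefficient in the local expansion at rho
C1 = exp(5 / 8 - exp(-0.25) / 2)

if __name__ == "__main__":
    delta = 1e-6
    print(rho, L(L(rho)), C1, (0.5 - F(rho - delta)) / sqrt(delta))
```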


▷ VI.12. The asymptotic number of trains. Combinatorial trains have been introduced in
Section IV.4 as a way to exemplify the power of complex asymptotic methods. One finds that, at its
dominant singularity $\rho$, the EGF $Tr(z)$ is of the form $Tr(z) \sim C/(1 - z/\rho)$, and, by singularity
analysis,
\[ [z^n] Tr(z) \sim 0.11768\,31406\,15497 \cdot 2.06131\,73279\,40138^n. \]
(This asymptotic approximation is good to 15 significant digits for $n = 50$, in accordance with
the fact that the dominant singularity is a simple pole.) ◁

VI.5. Multiple singularities


The previous section has described in detail the analysis of functions with a single
dominant singularity. The extension to functions that have finitely many (by necessity
isolated) singularities on their circle of convergence follows along entirely similar
lines. It parallels the situation of rational and meromorphic functions in Chapter IV
(p. 250) and is technically simple, the net result being:

In the case of multiple singularities, the separate contributions from each of
the singularities, as given by the basic singularity analysis process, must be
added up.

As in (28), we let $\mathcal{S}$ be the standard scale of functions singular at 1, namely
\[ \mathcal{S} = \{ (1-z)^{-\alpha} \lambda(z)^{\beta} \mid \alpha, \beta \in \mathbb{C} \}, \qquad \lambda(z) := \frac{1}{z} \log \frac{1}{1-z}. \]

Theorem VI.5 (Singularity analysis, multiple singularities). Let $f(z)$ be analytic in
$|z| < \rho$ and have a finite number of singularities on the circle $|z| = \rho$ at points
$\zeta_j = \rho e^{i\theta_j}$, for $j = 1 \,..\, r$. Assume that there exists a $\Delta$-domain $\Delta_0$ such that $f(z)$ is
analytic in the indented disc
\[ D = \bigcap_{j=1}^{r} (\zeta_j \cdot \Delta_0), \]
with $\zeta \cdot \Delta_0$ the image of $\Delta_0$ by the mapping $z \mapsto \zeta z$.
Assume that there exist $r$ functions $\sigma_1, \ldots, \sigma_r$, each a linear combination of
elements from $\mathcal{S}$, and a function $\tau \in \mathcal{S}$ such that
\[ f(z) = \sigma_j(z/\zeta_j) + O(\tau(z/\zeta_j)) \quad \text{as } z \to \zeta_j \text{ in } D. \]
Then the coefficients of $f(z)$ satisfy the asymptotic estimate
\[ f_n = \sum_{j=1}^{r} \zeta_j^{-n} \sigma_{j,n} + O(\rho^{-n} \tau^{\star}_n), \]
where each $\sigma_{j,n} = [z^n] \sigma_j(z)$ has its coefficients determined by Theorems VI.1, VI.2
and $\tau^{\star}_n = n^{a-1} (\log n)^{b}$, if $\tau(z) = (1-z)^{-a} \lambda(z)^{b}$.

A function analytic in a domain like $D$ is sometimes said to be star-continuable, a notion that
is the natural generalization of $\Delta$-analyticity for functions with several dominant singularities.
Also, a similar statement holds with $o$-error terms replacing $O$'s.
PROOF. Like in the case of a single singularity, the proof bases itself on Cauchy's
coefficient formula

\[ f_n = \frac{1}{2i\pi}\int_{\gamma} f(z)\,\frac{dz}{z^{n+1}}, \]

where a composite contour $\gamma$ depicted on Figure 8 is used. Estimates on each part of
the contour obey exactly the same principles as in the proof of Theorems VI.1-VI.3.
Let $\gamma^{(j)}$ be the open loop around $\zeta_j$ that comes from the outer circle, winds about $\zeta_j$
and joins again the outer circle; let $r$ be the radius of the outer circle.

FIGURE VI.8. Multiple singularities ($r = 3$): analyticity domain ($D$, left) and composite
integration contour ($\gamma$, right).

— The contribution along the arcs of the outer circle is $O(r^{-n})$, that is, exponentially
small.

— The contribution along the loop $\gamma^{(1)}$ (say) separates into

\[ \frac{1}{2i\pi}\int_{\gamma^{(1)}} f(z)\,\frac{dz}{z^{n+1}} = I' + I'', \]
\[ I' := \frac{1}{2i\pi}\int_{\gamma^{(1)}} \sigma_1(z/\zeta_1)\,\frac{dz}{z^{n+1}}, \qquad I'' := \frac{1}{2i\pi}\int_{\gamma^{(1)}} \big(f(z) - \sigma_1(z/\zeta_1)\big)\,\frac{dz}{z^{n+1}}. \]

The quantity $I'$ is estimated by extending the open loop to infinity by the
same method as in the proof of Theorems VI.1 and VI.2: it is found to equal
$\zeta_1^{-n}\sigma_{1,n}$ plus an exponentially small term. The quantity $I''$, corresponding
to the error term, is estimated by the same bounding technique as in the
proof of Theorem VI.3 and is found to be $O(\rho^{-n}\tau_n^{\star})$.

Collecting the various contributions completes the proof of the statement.


Theorem VI.5 expresses that, in the case of multiple singularities, each dominant
singularity can be analysed separately; the singular expansions are then each
transferred to coefficients, and the corresponding asymptotic contributions are finally
collected. Two examples illustrating the process follow.


EXAMPLE VI.6. An artificial example. Let us demonstrate the modus operandi on the simple
function

\[ (32) \qquad g(z) = \frac{e^{z}}{\sqrt{1-z^{2}}}. \]

There are two singularities at $z = +1$ and $z = -1$, with

\[ g(z) \sim \frac{e}{\sqrt{2}\,\sqrt{1-z}} \quad (z \to +1), \qquad g(z) \sim \frac{e^{-1}}{\sqrt{2}\,\sqrt{1+z}} \quad (z \to -1). \]

The function is clearly star-continuable with the singular expansions valid in the star domain.
We have

\[ [z^n]\frac{e}{\sqrt{2}\,\sqrt{1-z}} \sim \frac{e}{\sqrt{2\pi n}}, \qquad [z^n]\frac{e^{-1}}{\sqrt{2}\,\sqrt{1+z}} \sim \frac{e^{-1}(-1)^n}{\sqrt{2\pi n}}. \]

To get the coefficient $[z^n]g(z)$, it suffices to add up these two contributions (by Theorem VI.5),
so that

\[ [z^n]g(z) \sim \frac{1}{\sqrt{2\pi n}}\left[ e + (-1)^n e^{-1} \right]. \]

If expansions at $+1$ (respectively $-1$) are written with an error term, which is of the form
$O((z-1)^{1/2})$ (respectively, $O((z+1)^{1/2})$), there results an estimate of the coefficients $g_n =
[z^n]g(z)$, which can be put under the form

\[ g_{2n} = \frac{\cosh(1)}{\sqrt{\pi n}} + O\big(n^{-3/2}\big), \qquad g_{2n+1} = \frac{\sinh(1)}{\sqrt{\pi n}} + O\big(n^{-3/2}\big). \]

This makes explicit the dependency of the asymptotic form of $g_n$ on the parity of the index $n$.
Clearly a full asymptotic expansion can be obtained. ........... END OF EXAMPLE VI.6.
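The parity effect can be observed numerically. The sketch below computes the Taylor coefficients of $g(z) = e^z/\sqrt{1-z^2}$ by a Cauchy product in floating point and rescales them by $\sqrt{\pi n}$; the helper name `g_coeffs` is ours, and the comparison is only meant as an illustration of the two regimes.

```python
import math

def g_coeffs(N):
    """Return [g_0, ..., g_N], the Taylor coefficients of exp(z)/sqrt(1 - z^2)."""
    # c[m] = [z^m] (1 - z^2)^(-1/2): binomial(2k, k)/4^k for m = 2k, and 0 for odd m.
    c = [0.0] * (N + 1)
    term = 1.0
    for k in range(N // 2 + 1):
        c[2 * k] = term
        term *= (2 * k + 1) / (2 * k + 2)
    # e[k] = 1/k!, built iteratively so that no huge integers are formed.
    e = [1.0] * (N + 1)
    for k in range(1, N + 1):
        e[k] = e[k - 1] / k
    # Cauchy product of the two series.
    return [sum(e[k] * c[n - k] for k in range(n + 1)) for n in range(N + 1)]

if __name__ == "__main__":
    g = g_coeffs(401)
    n = 200
    print("even:", g[2 * n] * math.sqrt(math.pi * n), "vs cosh(1) =", math.cosh(1))
    print("odd: ", g[2 * n + 1] * math.sqrt(math.pi * n), "vs sinh(1) =", math.sinh(1))
```

Even-index and odd-index coefficients settle near $\cosh(1)$ and $\sinh(1)$ respectively after rescaling, with a relative discrepancy of order $1/n$.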


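EXAMPLE VI.7. Permutations with cycles of odd length. Consider the specification and EGF

\[ \mathcal{F} = \mathrm{SET}(\mathrm{CYC}_{\mathrm{odd}}(\mathcal{Z})) \quad\Longrightarrow\quad F(z) = \exp\left(\frac12\log\frac{1+z}{1-z}\right) = \sqrt{\frac{1+z}{1-z}}. \]

The singularities of $F$ are at $z = +1$ and $z = -1$, the function being obviously star-continuable.
By singularity analysis (Theorem VI.5), we have automatically:

\[ F(z) = \begin{cases} \dfrac{2^{1/2}}{\sqrt{1-z}} + O\big((1-z)^{1/2}\big) & (z \to 1) \\[2mm] O\big((1+z)^{1/2}\big) & (z \to -1) \end{cases} \quad\Longrightarrow\quad [z^n]F(z) = \frac{2^{1/2}}{\sqrt{\pi n}} + O\big(n^{-3/2}\big). \]

For the next asymptotic order, the singular expansions

\[ F(z) = \begin{cases} \dfrac{2^{1/2}}{\sqrt{1-z}} - 2^{-3/2}\sqrt{1-z} + O\big((1-z)^{3/2}\big) & (z \to 1) \\[2mm] 2^{-1/2}\sqrt{1+z} + O\big((1+z)^{3/2}\big) & (z \to -1) \end{cases} \]

yield

\[ [z^n]F(z) = \frac{2^{1/2}}{\sqrt{\pi n}} - \frac{(-1)^n\,2^{-3/2}}{\sqrt{\pi n^3}} + O\big(n^{-5/2}\big). \]

This example illustrates the occurrence of singularities that have different weights, in the sense
of being associated with different exponents. .................. END OF EXAMPLE VI.7.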
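A short numerical comparison makes the role of the second singularity visible. The sketch below uses the identity $\sqrt{(1+z)/(1-z)} = (1+z)(1-z^2)^{-1/2}$ to obtain the coefficients, and compares them with the two-term estimate $2^{1/2}/\sqrt{\pi n} - (-1)^n 2^{-3/2}/\sqrt{\pi n^3}$; the function names are illustrative.

```python
import math

def odd_cycle_coeffs(N):
    """[z^n] sqrt((1+z)/(1-z)) for n = 0..N, via (1+z) * (1 - z^2)^(-1/2)."""
    c = [0.0] * (N + 2)
    term = 1.0
    for k in range(N // 2 + 1):
        c[2 * k] = term                      # binomial(2k, k) / 4^k
        term *= (2 * k + 1) / (2 * k + 2)
    return [c[n] + (c[n - 1] if n >= 1 else 0.0) for n in range(N + 1)]

def two_term_estimate(n):
    """Contribution of z = 1 (order n^(-1/2)) plus that of z = -1 (order n^(-3/2))."""
    lead = math.sqrt(2.0) / math.sqrt(math.pi * n)
    corr = (-1) ** n * 2.0 ** (-1.5) / math.sqrt(math.pi * n ** 3)
    return lead - corr

if __name__ == "__main__":
    f = odd_cycle_coeffs(101)
    for n in (100, 101):
        print(n, f[n], two_term_estimate(n), f[n] / two_term_estimate(n))
```

With the alternating term included, the ratio is close to 1 for both parities; dropping it leaves a parity-dependent relative error of order $1/n$.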


The discussion of multiple dominant singularities ties well with the earlier discussion
of Subsection IV.6.1, p. 250. In the periodic case where the dominant singularities
are at roots of unity, different regimes manifest themselves cyclically depending
on congruence properties of the index $n$, like in the two examples above. When
the dominant singularities have arguments that are not commensurate to $\pi$ (a comparatively
rare situation), aperiodic fluctuations appear, in which case the situation is
similar to what was already discussed, regarding rational and meromorphic functions,
in Subsection IV.6.1.


VI.6. Intermezzo: functions of singularity analysis class 


Let us say that a function is of singularity analysis class, or SA-class for short,
if it satisfies the conditions of singularity analysis, as expressed by Theorem VI.4
(single dominant singularity) or Theorem VI.5 (multiple dominant singularities). The
property of being of SA-class is preserved by several basic operations of analysis: we
have already seen this feature in passing, when determining singular expansions of
functions obtained by sums, products, or compositions in Examples 2-5.

As a starting example, it is easily recognized that the assumptions of $\Delta$-analyticity
for two functions $f(z)$, $g(z)$ accompanied by the singular expansions

\[ f(z) \underset{z\to1}{\sim} c\,(1-z)^{-\alpha}, \qquad g(z) \underset{z\to1}{\sim} d\,(1-z)^{-\beta}, \]

and the condition $\alpha, \beta \notin \mathbb{Z}_{\le 0}$ imply for the coefficients of the sum

\[ [z^n]\big(f(z) + g(z)\big) \sim \begin{cases} c\,\dfrac{n^{\alpha-1}}{\Gamma(\alpha)} & \alpha > \beta \\[2mm] (c+d)\,\dfrac{n^{\alpha-1}}{\Gamma(\alpha)} & \alpha = \beta,\ c + d \ne 0 \\[2mm] d\,\dfrac{n^{\beta-1}}{\Gamma(\beta)} & \alpha < \beta. \end{cases} \]

Similarly, for products, we have

\[ [z^n]\big(f(z)g(z)\big) \sim cd\,\frac{n^{\alpha+\beta-1}}{\Gamma(\alpha+\beta)}, \]

provided $\alpha + \beta \notin \mathbb{Z}_{\le 0}$.


The simple considerations above illustrate the robustness of singularity analysis.
They also indicate that properties are easy to state in the generic case where no negative
integral exponents are present. However, if all cases are to be covered, there can
easily be an explosion of the number of particular situations, which may render somewhat
clumsy the enunciation of complete statements. Accordingly, in what follows,
we shall largely confine ourselves to generic cases, as long as these suffice to develop
the important mathematical technique at stake for each particular problem.

In the remainder of this chapter, we proceed to enlarge the class of functions
recognized to be of SA-class, keeping in mind the needs of analytic combinatorics.
The following types of functions are treated in later sections.

— Inverse functions (Section VI.7). The inverse of an analytic function is, under
mild conditions, of SA-class. In the case of functions attached to simple
varieties of trees (corresponding to the inversion of $y/\phi(y)$), the singular
expansion invariably has an exponent of $\frac12$ attached to it (a square-root
singularity). This applies in particular to the Cayley tree function, in terms of
which many combinatorial structures and parameters can be analysed.

— Polylogarithms (Section VI.8). These functions are the generating functions
of simple arithmetic sequences like $(n^{\theta})$ for an arbitrary $\theta \in \mathbb{C}$. The
fact that polylogarithms are of SA-class opens the possibility of estimating
a large number of sums, which involve both combinatorial terms (e.g.,
binomial coefficients) and elements like $\sqrt{n}$ and $\log n$. Such sums appear
recurrently in the analysis of cost functionals of combinatorial structures
and algorithms.

— Composition (Section VI.9). The composition of functions of SA-class often
proves to be itself of SA-class. This fact has implications for the analysis
of composition schemas and makes possible a broad extension of the supercritical
sequence schema treated in Section V.4 (p. 313).

— Differentiation, integration, and Hadamard products (Section VI.10). These
are three operations on analytic functions that preserve the property for a
function to be of SA-class. Applications are given to tree recurrences and to
multidimensional walk problems.


A main theme of this book is that elementary combinatorial classes tend to have 
generating functions whose singularity structure is strongly constrained—in many 
cases, singularities are isolated. The singularity analysis process is then a prime tech- 
nique for extracting asymptotic information from such generating functions. 


VI.7. Inverse functions 


Recursively defined structures lead to functional equations whose solutions may
often be analysed locally near singularities. An important case is the one of functions
defined by inversion. It includes the Cayley tree function as well as all generating
functions associated to simple varieties of trees (Subsections I.5.1 (p. 61), II.5.1
(p. 116), and III.6.2 (p. 182)). A common pattern in this context is the appearance
of singularities of the square-root type, which proves to be universal amongst a broad
class of problems involving trees and tree-like structures. Accordingly, by singularity
analysis, the square-root singularity induces subexponential factors of the asymptotic
form $n^{-3/2}$ in coefficients' expansions.


Inverse functions. Singularities of functions defined by inversion have been located
in Subsection IV.7.1 (p. 261) and our treatment will proceed from there. The
goal is to estimate the coefficients of a function defined implicitly by an equation of
the form

\[ (33) \qquad y(z) = z\,\phi(y(z)) \qquad\text{or equivalently}\qquad z = \frac{y(z)}{\phi(y(z))}. \]

The problem of solving (33) is one of functional inversion: we have seen (Lemmas
IV.2 and IV.3, pp. 262-263) that an analytic function admits locally an analytic
inverse if and only if its first derivative is nonzero. We operate here under the following
assumptions:

— Condition (H1). The function $\phi(u)$ is analytic at $u = 0$ and satisfies

\[ (34) \qquad \phi(0) \ne 0, \qquad [u^n]\phi(u) \ge 0, \qquad \phi(u) \not\equiv \phi_0 + \phi_1 u. \]

(As a consequence, the inversion problem is well defined around 0. The
nonlinearity of $\phi$ only excludes the case $\phi(u) = \phi_0 + \phi_1 u$, corresponding
to $y(z) = \phi_0 z/(1 - \phi_1 z)$.)

— Condition (H2). Within the open disc of convergence of $\phi$ at 0, $|z| < R$,
there exists a (necessarily unique) positive solution to the characteristic
equation:

\[ (35) \qquad \exists\,\tau,\ 0 < \tau < R, \qquad \phi(\tau) - \tau\phi'(\tau) = 0. \]

(Existence is granted as soon as $\lim x\phi'(x)/\phi(x) > 1$ as $x \to R^{-}$, with $R$
the radius of convergence of $\phi$ at 0; see Proposition IV.5, p. 264.)

FIGURE VI.9. The images of concentric circles by the mapping $y \mapsto z = ye^{-y}$. It
is seen that $y \mapsto z = ye^{-y}$ is injective on $|y| \le 1$ with an image extending beyond the
circle $|z| = e^{-1}$ [in grey], so that the inverse function $y(z)$ is analytically continuable in a
$\Delta$-domain around $z = e^{-1}$. Since the direct mapping $ye^{-y}$ is quadratic at 1 (with value
$e^{-1}$), the inverse function has a square-root singularity at $e^{-1}$ (with value 1).

Then (by Proposition IV.5, p. 264), the radius of convergence of $y(z)$ is the corresponding
positive value $\rho$ of $z$ such that $y(\rho) = \tau$, that is to say,

\[ (36) \qquad \rho = \frac{\tau}{\phi(\tau)} = \frac{1}{\phi'(\tau)}. \]

We start with a calculation indicating in a plain context the occurrence of a square-root
singularity.


EXAMPLE VI.8. A simple analysis of the Cayley tree function. The situation corresponding
to the function $\phi(u) = e^{u}$, so that $y(z) = ze^{y(z)}$ (defining the Cayley tree function $T(z)$), is
typical of general analytic inversion. From (35), the radius of convergence of $y(z)$ is $\rho = e^{-1}$
corresponding to $\tau = 1$. The image of a circle in the $y$-plane, centered at the origin and having
radius $r < 1$, by the function $ye^{-y}$ is a curve of the $z$-plane that properly contains the circle
$|z| = re^{-r}$ (see Figure 9) as $\phi(y) = e^{y}$, which has nonnegative coefficients, satisfies

\[ \left|\phi(re^{i\theta})\right| \le \phi(r) \qquad\text{for all } \theta \in [-\pi, +\pi], \]

the inequality being strict for all $\theta \ne 0$. The following observation is the key to analytic
continuation: Since the first derivative of $y/\phi(y)$ vanishes at 1, the mapping $y \mapsto y/\phi(y)$
is angle-doubling, so that the image of the circle of radius 1 is a curve $\mathcal{C}$ that has a cusp at
$\rho = e^{-1}$. (See Figure 9; Notes 16 and 17 provide interesting generalizations.)

This geometry shows that the solution of $z = ye^{-y}$ is uniquely defined for $z$ inside $\mathcal{C}$.
Thus, $y(z)$ is $\Delta$-analytic. A singular expansion for $y(z)$ is then derived from reversion of the
power series expansion of $z = ye^{-y}$. We have

\[ ye^{-y} = e^{-1} - \frac{e^{-1}}{2}(y-1)^2 + \frac{e^{-1}}{3}(y-1)^3 - \frac{e^{-1}}{8}(y-1)^4 + \cdots, \]

so that solving for $y$ gives

\[ y - 1 = -\sqrt{2}\,(1 - ez)^{1/2} + \frac23(1 - ez) + O\big((1 - ez)^{3/2}\big), \]

and a full expansion can be obtained. ........................ END OF EXAMPLE VI.8.
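The square-root expansion of the Cayley tree function can be checked without implementing the inverse function itself: pick $y < 1$, map it forward by $z = ye^{-y}$, and compare $y$ with the two-term approximation $1 - \sqrt{2}(1-ez)^{1/2} + \frac23(1-ez)$ evaluated at that $z$. The sketch below does this; the function names are ours and only the standard library is assumed.

```python
import math

def cayley_two_term(z):
    """Two-term singular approximation of the Cayley tree function near z = 1/e."""
    delta = 1.0 - math.e * z          # the quantity 1 - e*z, small near the singularity
    return 1.0 - math.sqrt(2.0) * math.sqrt(delta) + (2.0 / 3.0) * delta

def z_of(y):
    """Direct mapping z = y * exp(-y); for 0 <= y < 1 its inverse is y = T(z)."""
    return y * math.exp(-y)

if __name__ == "__main__":
    for y in (0.9, 0.99, 0.999):
        z = z_of(y)
        approx = cayley_two_term(z)
        print(y, approx, abs(y - approx))
```

The error decreases like $(1 - ez)^{3/2}$, in agreement with the order of the neglected term.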


Analysis of inverse functions. The calculation of Example 8 now needs to be
extended to the general case, $y = z\phi(y)$. This involves three steps: (i) all the dominant
singularities are to be located; (ii) analyticity of $y(z)$ in a $\Delta$-domain must be
established; (iii) the singular expansion, obtained formally so far and involving a
square-root singularity, needs to be determined. Step (i) requires a special discussion
and is related to periodicities.

A simple example like $\phi(u) = 1 + u^2$ (binary trees), for which

\[ y(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z}, \]

shows that $y(z)$ may have several dominant singularities: here, two conjugate singularities
at $-\frac12$ and $+\frac12$. The conditions for this to happen are rather simple. Let us say
that a function analytic at 0, $f(u)$, is $p$-periodic if $f(u) = u^{e}g(u^{p})$ for some power
series $g$ (see p. 252). A function is called periodic if it is $p$-periodic for some $p \ge 2$
and aperiodic otherwise. An elementary argument developed in Note 15 below shows
that periodicity does not occur for $y(z)$ unless $\phi(u)$ is itself periodic, a case which
turns out to be easily reducible to the aperiodic situation.


Theorem VI.6 (Singular Inversion). Let $\phi$ be a nonlinear function satisfying the conditions
(H1) and (H2) of Equations (34) and (35), and let $y(z)$ be the solution of
$y = z\phi(y)$ satisfying $y(0) = 0$. Then, the quantity $\rho = \tau/\phi(\tau)$ is the radius of convergence
of $y(z)$ at 0 (with $\tau$ the root of the characteristic equation), and the singular
expansion of $y(z)$ near $\rho$ is of the form

\[ y(z) = \tau - d_1\sqrt{1 - z/\rho} + \sum_{j \ge 2} (-1)^j d_j\,(1 - z/\rho)^{j/2}, \qquad d_1 := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}}, \]

with the $d_j$ being some computable constants.
Assume that, in addition, $\phi$ is aperiodic⁴. Then, one has

\[ [z^n]y(z) \sim \sqrt{\frac{\phi(\tau)}{2\phi''(\tau)}}\;\frac{\rho^{-n}}{\sqrt{\pi n^3}}\left[ 1 + \sum_{k=1}^{\infty}\frac{e_k}{n^k} \right], \]

for a family $e_k$ of computable constants.

⁴ If $\phi$ has maximal period $p$, then one must restrict $n$ to $n \equiv 1 \bmod p$; in that case, there is an extra
factor of $p$ in the estimate of $y_n$: see Note 15 and Equation (38).

PROOF. Proposition IV.5, p. 264, shows that $\rho$ is indeed the radius of convergence
of $y(z)$. The Singular Inversion Lemma (Lemma IV.3, p. 263) also shows that $y(z)$
can be continued to a neighbourhood of $\rho$ slit along the ray $\mathbb{R}_{\ge\rho}$.

The singular expansion at $\rho$ is determined like in Example 8. Indeed, the relation
between $z$ and $y$, in the vicinity of $(z, y) = (\rho, \tau)$, may be put under the form

\[ (37) \qquad \rho - z = H(y), \qquad\text{where}\quad H(y) := \left( \frac{\tau}{\phi(\tau)} - \frac{y}{\phi(y)} \right), \]

the function $H(y)$ in the right hand side being such that $H(\tau) = H'(\tau) = 0$. Thus,
the dependency between $y$ and $z$ is locally a quadratic one:

\[ \rho - z = \frac{1}{2!}H''(\tau)(y - \tau)^2 + \frac{1}{3!}H'''(\tau)(y - \tau)^3 + \cdots. \]

When this relation is locally inverted, a square-root appears:

\[ -\sqrt{\rho - z} = \sqrt{\frac{H''(\tau)}{2}}\,(y - \tau)\left[ 1 + c_1(y - \tau) + c_2(y - \tau)^2 + \cdots \right]. \]

The determination with a $-\sqrt{\ }$ should be chosen there as $y(z)$ increases to $\tau^{-}$ as $z \to \rho^{-}$.
This implies, by solving with respect to $y - \tau$, the relation

\[ y - \tau \sim -d_1^{\star}(\rho - z)^{1/2} + d_2^{\star}(\rho - z) - d_3^{\star}(\rho - z)^{3/2} + \cdots, \]

where $d_1^{\star} = \sqrt{2/H''(\tau)}$ with $H''(\tau) = \tau\phi''(\tau)/\phi(\tau)^2$. The singular expansion at $\rho$
results.

There now remains to exclude the possibility for $y(z)$ to have singularities other
than $\rho$ on the circle $|z| = \rho$. Observe that $y(\rho)$ is well defined (in fact $y(\rho) = \tau$), so
that the series representing $y(z)$ converges at $\rho$ as well as on the whole circle (given
positivity of the coefficients). If $\phi(z)$ is aperiodic, then so is $y(z)$. Consider any point
$\zeta$ such that $|\zeta| = \rho$ and $\zeta \ne \rho$ and set $\eta = y(\zeta)$. We then have $|\eta| < \tau$ (by the
Daffodil Lemma: Lemma IV.1, p. 253). The function $y(z)$ is analytic at $\zeta$ by virtue of
the Analytic Inversion Lemma (Lemma IV.2, p. 262) and the property that

\[ \left. \frac{d}{dy}\,\frac{y}{\phi(y)} \right|_{y=\eta} \ne 0. \]

(This last property derives from the fact that the numerator of the quantity on the left,

\[ \phi(\eta) - \eta\phi'(\eta) = \phi_0 - \phi_2\eta^2 - 2\phi_3\eta^3 - 3\phi_4\eta^4 - \cdots, \]

cannot vanish, by the triangle inequality since $|\eta| < \tau$.) Thus, under the aperiodicity
assumption, $y(z)$ is analytic on the circle $|z| = \rho$ punctured at $\rho$. The expansion of
the coefficients then results from basic singularity analysis.


Figure 10 provides a table of the most basic varieties of simple trees and the 
corresponding asymptotic estimates. With Theorem VI.6, we now have available a 
powerful method that permits us to analyse not only implicitly defined functions but 
also expressions built upon them. This fact will be put to good use in Chapter VII, 
when analysing a number of parameters associated to simple varieties of trees. 
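To see the estimate of Theorem VI.6 at work, one can compare exact tree counts with the leading term $\sqrt{\phi(\tau)/(2\phi''(\tau))}\,\rho^{-n}/\sqrt{\pi n^3}$. The following Python sketch is only an illustration under simplifying assumptions: $\phi$ is a polynomial given by its list of nonnegative integer coefficients, exact counts come from Lagrange inversion, $[z^n]y(z) = \frac1n[u^{n-1}]\phi(u)^n$, and the characteristic equation is solved by bisection on an interval $(0, \texttt{hi})$ supplied by the caller. All function names are ours.

```python
import math
from fractions import Fraction

def tree_count(phi, n):
    """[z^n] y(z) for y = z*phi(y), n >= 1, by Lagrange inversion: (1/n) [u^(n-1)] phi(u)^n."""
    power = [1]                                   # phi(u)^0, kept truncated at degree n-1
    for _ in range(n):
        new = [0] * min(n, len(power) + len(phi) - 1)
        for i, a in enumerate(power):
            if a == 0:
                continue
            for j, b in enumerate(phi):
                if i + j < len(new):
                    new[i + j] += a * b
        power = new
    coeff = power[n - 1] if n - 1 < len(power) else 0
    return Fraction(coeff, n)

def characteristic_root(phi, hi, tol=1e-14):
    """Solve phi(t) - t*phi'(t) = 0 on (0, hi) by bisection (the left side is decreasing)."""
    def f(t):
        val = sum(c * t ** k for k, c in enumerate(phi))
        der = sum(k * c * t ** (k - 1) for k, c in enumerate(phi) if k >= 1)
        return val - t * der
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tree_estimate(phi, n, hi=10.0):
    """Leading term of the asymptotic estimate (aperiodic case)."""
    tau = characteristic_root(phi, hi)
    val = sum(c * tau ** k for k, c in enumerate(phi))
    der2 = sum(k * (k - 1) * c * tau ** (k - 2) for k, c in enumerate(phi) if k >= 2)
    rho = tau / val
    return math.sqrt(val / (2.0 * der2)) * rho ** (-n) / math.sqrt(math.pi * n ** 3)

if __name__ == "__main__":
    unary_binary = [1, 1, 1]                      # phi(u) = 1 + u + u^2
    print([int(tree_count(unary_binary, n)) for n in range(1, 8)])
    for n in (50, 200):
        print(n, float(tree_count(unary_binary, n)) / tree_estimate(unary_binary, n))
```

For unary-binary trees the ratio of exact to estimated counts approaches 1 with a relative error of order $1/n$. A periodic $\phi$ such as $1 + u^2$ would need the extra factor described in Note 15, which this sketch does not handle.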
▷ VI.13. Computability of singular expansions. Define

\[ h(w) := \sqrt{\frac{\tau/\phi(\tau) - w/\phi(w)}{(\tau - w)^2}}, \]

so that $y(z)$ satisfies $\sqrt{\rho - z} = (\tau - y)h(y)$. The singular expansion of $y$ can then be deduced
by Lagrange inversion from the expansion of the negative powers of $h(w)$ at $w = \tau$. This
technique yields for instance explicit forms for coefficients in the singular expansion of $y =
ze^{y}$. ◁

FIGURE VI.10. Singularity analysis of some simple varieties of trees (types: binary,
unary-binary, general, Cayley).


▷ VI.14. Stirling's formula via singularity analysis. The solution to $T = ze^{T}$ analytic at 0 is
the Cayley tree function. It satisfies $[z^n]T(z) = n^{n-1}/n!$ (by Lagrange inversion) and, at the same
time, its singularity is known from Theorem VI.6. As a consequence:

\[ \frac{n^{n-1}}{n!} \sim \frac{e^{n}}{\sqrt{2\pi n^3}}\left( 1 - \frac{1}{12}n^{-1} + \frac{1}{288}n^{-2} + \frac{139}{51840}n^{-3} - \cdots \right). \]

Thus Stirling's formula also results from singularity analysis. ◁


▷ VI.15. Periodicities. Assume that $\phi(u) = \psi(u^{p})$ with $\psi$ analytic at 0 and $p \ge 2$. Let $y =
y(z)$ be the root of $y = z\phi(y)$. Set $Z = z^{p}$ and let $Y(Z)$ be the root of $Y = Z\psi(Y)^{p}$. One has
by construction $y(z) = Y(z^{p})^{1/p}$, given that $y^{p} = z^{p}\phi(y)^{p}$. Since $Y(Z) = Y_1 Z + Y_2 Z^2 + \cdots$,
we verify that the nonzero coefficients of $y(z)$ are amongst those of index $1, 1+p, 1+2p, \ldots$.

If $p$ is chosen maximal, then $\psi(u)^{p}$ is aperiodic. Then Theorem VI.6 applies to $Y(Z)$: the
function $Y(Z)$ is analytically continuable beyond its dominant singularity at $Z = \rho^{p}$; it has a
square root singularity at $\rho^{p}$ and no other singularity on $|Z| = \rho^{p}$. Also, since $Y = Z\psi(Y)^{p}$,
the function $Y(Z)$ cannot vanish on $|Z| \le \rho^{p}$, $Z \ne 0$. Thus, $Y(Z)^{1/p}$ is analytic in $|Z| \le \rho^{p}$,
except at $\rho^{p}$ where it has a $\sqrt{\ }$ branch point. All computations done, we find that

\[ (38) \qquad [z^n]y(z) \sim p \cdot \frac{d_1\,\rho^{-n}}{2\sqrt{\pi n^3}} \qquad\text{when } n \equiv 1 \pmod{p}. \]

The argument also shows that $y(z)$ has $p$ conjugate roots on its circle of convergence. (This is
a kind of Perron-Frobenius property for periodic tree functions.) ◁


▷ VI.16. Boundary cases I. The case when $\tau$ lies on the boundary of the disc of convergence
of $\phi$ may lead to asymptotic estimates differing from the usual $\rho^{-n}n^{-3/2}$ prototype. Without
loss of generality, take $\phi$ aperiodic to have radius of convergence equal to 1 and assume that $\phi$
is of the form

\[ (39) \qquad \phi(u) = u + c(1-u)^{\alpha} + o\big((1-u)^{\alpha}\big), \qquad\text{with } 1 < \alpha \le 2, \]

as $u$ tends to 1 with $|u| < 1$. The solution of the characteristic equation $\phi(\tau) - \tau\phi'(\tau) = 0$ is
then $\tau = 1$. The function $y(z)$ defined by $y = z\phi(y)$ is $\Delta$-analytic (by a mapping argument
similar to the one exemplified by Figure 9 and related to the fact that $\phi$ "multiplies" angles
near 1). The singular expansion of $y(z)$ and the coefficients then satisfy

\[ (40) \qquad y(z) = 1 - c^{-1/\alpha}(1-z)^{1/\alpha} + o\big((1-z)^{1/\alpha}\big) \quad\Longrightarrow\quad y_n \sim c^{-1/\alpha}\,\frac{n^{-1/\alpha-1}}{-\Gamma(-1/\alpha)}. \]

[The case $\alpha = 2$ was first observed by Janson [281]. Trees with $\alpha \in (1, 2)$ have been investigated
in connection with stable Lévy processes [141]. The singular exponent $\alpha = \frac32$ occurs for
instance in planar maps (Chapter VII), so that GFs with coefficients of the form $\rho^{-n}n^{-5/3}$
arise, when considering trees whose nodes are themselves maps.] ◁


▷ VI.17. Boundary cases II. Let $\phi(u)$ be the probability generating function of a random
variable $X$ with mean equal to 1 and such that $\phi_n \sim \lambda n^{-\alpha-1}$, with $1 < \alpha < 2$. Then,
by a complex version of an Abelian theorem (see, e.g., [56, §1.7] and [182]), the singular
expansion (39) holds when $u \to 1$, $|u| < 1$, within a cone, so that the conclusions of (40)
hold in that case. Similarly, if $\phi''(1)$ exists, meaning that $X$ has a second moment, then the
estimate (40) holds with $\alpha = 2$, and then coincides with what Theorem VI.6 predicts [281]. (In
probabilistic terms, the condition of Theorem VI.6 is equivalent to postulating the existence of
exponential moments for the one-generation offspring distribution.) ◁


VI.8. Polylogarithms 


Expressions involving sequences like $(\sqrt{n})$ or $(\log n)$ can be subjected to singularity
analysis. The starting point is the definition of the generalized polylogarithm
$\mathrm{Li}_{\alpha,r}$, where $\alpha$ is an arbitrary complex number and $r$ a nonnegative integer:

\[ \mathrm{Li}_{\alpha,r}(z) := \sum_{n \ge 1} (\log n)^{r}\,\frac{z^{n}}{n^{\alpha}}. \]

The series converges for $|z| < 1$, so that the function $\mathrm{Li}_{\alpha,r}$ is a priori analytic in
the unit disc. The quantity $\mathrm{Li}_{1,0}(z)$ is the usual logarithm, $\log(1-z)^{-1}$, hence the
established name, polylogarithm, assigned to these functions [331]. In what follows,
we make use of the abbreviation $\mathrm{Li}_{\alpha,0}(z) \equiv \mathrm{Li}_{\alpha}(z)$, so that $\mathrm{Li}_1(z) \equiv \mathrm{Li}_{1,0}(z) \equiv
\log(1-z)^{-1}$ is the GF of the sequence $(1/n)$. Similarly, $\mathrm{Li}_{0,1}$ is the GF of the
sequence $(\log n)$ and $\mathrm{Li}_{-1/2}(z)$ is the GF of the sequence $(\sqrt{n})$.

Polylogarithms are continuable to the whole of the complex plane slit along the
ray $\mathbb{R}_{\ge 1}$, a fact established early in the twentieth century by Ford [220]. They are
of SA-class [176] and their singular expansions involve the Riemann zeta function
defined by

\[ \zeta(s) = \sum_{n \ge 1} \frac{1}{n^{s}} \]

for $\Re(s) > 1$, and by analytic continuation elsewhere [470].

Theorem VI.7 (Singularities of polylogarithms). For all $\alpha \in \mathbb{C}$ and $r \in \mathbb{Z}_{\ge 0}$, the
function $\mathrm{Li}_{\alpha,r}(z)$ is analytic in the slit plane $\mathbb{C} \setminus \mathbb{R}_{\ge 1}$. For $\alpha \notin \{1, 2, \ldots\}$, there exists
an infinite singular expansion (with logarithmic terms when $r > 0$) given by the
two rules:

\[ (41) \qquad \begin{aligned} \mathrm{Li}_{\alpha}(z) &\sim \Gamma(1-\alpha)\,w^{\alpha-1} + \sum_{j \ge 0} \frac{(-1)^{j}}{j!}\,\zeta(\alpha - j)\,w^{j}, \qquad w := \sum_{\ell \ge 1} \frac{(1-z)^{\ell}}{\ell}, \\ \mathrm{Li}_{\alpha,r}(z) &= (-1)^{r}\,\frac{\partial^{r}}{\partial\alpha^{r}}\,\mathrm{Li}_{\alpha}(z). \end{aligned} \]

The expansion of $\mathrm{Li}_{\alpha}$ is conveniently described by the composition of two expansions, with the
expansion of $w = -\log z$ at $z = 1$, namely $w = (1-z) + \frac12(1-z)^2 + \cdots$, to be substituted
inside the expansion into powers of $w$. The exponents of $(1-z)$ involved in the resulting
expansion are $\{\alpha - 1, \alpha, \ldots\} \cup \{0, 1, \ldots\}$. For $\alpha < 1$, the main asymptotic term of $\mathrm{Li}_{\alpha,r}$ is

\[ \mathrm{Li}_{\alpha,r}(z) \sim \Gamma(1-\alpha)\,(1-z)^{\alpha-1}L(z)^{r}, \qquad L(z) := \log\frac{1}{1-z}, \]

while, for $\alpha > 1$, we have $\mathrm{Li}_{\alpha,r}(z) \sim (-1)^{r}\zeta^{(r)}(\alpha)$, since the sum converges.
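As an illustration of how the singular expansion is used, the sketch below compares a direct summation of $\mathrm{Li}_{1/2}(z) = \sum_{n\ge1} z^n/\sqrt{n}$ with the first two terms of the expansion in powers of $w = -\log z$, namely $\Gamma(\frac12)w^{-1/2} + \zeta(\frac12)$. The numerical value of $\zeta(\frac12)$ is hard-coded as a reference constant (the standard library has no zeta function), and the function names are ours.

```python
import math

ZETA_HALF = -1.4603545088095868    # zeta(1/2), hard-coded reference value

def polylog_series(alpha, z, terms=20000):
    """Li_alpha(z) = sum_{n>=1} z^n / n^alpha by direct summation (requires |z| < 1)."""
    return math.fsum(z ** n / n ** alpha for n in range(1, terms + 1))

def li_half_singular(z):
    """Two leading terms for alpha = 1/2: Gamma(1/2) * w^(-1/2) + zeta(1/2), w = -log z."""
    w = -math.log(z)
    return math.gamma(0.5) / math.sqrt(w) + ZETA_HALF

if __name__ == "__main__":
    for z in (0.9, 0.99, 0.999):
        exact = polylog_series(0.5, z)
        approx = li_half_singular(z)
        print(z, exact, approx, exact - approx)
```

The difference between the two columns is of order $w$, consistent with the next term of the expansion being linear in $w$. Direct summation becomes slow as $z \to 1$, which is precisely where the singular expansion is most useful.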

PROOF. The analysis crucially relies on the Mellin transform (see APPENDIX B:
Mellin transform, p. 707). We start with the case $r = 0$ and consider several ways in
which $z$ may approach the singularity 1. Step (i) below describes the main ingredient
needed in obtaining the expansion, the subsequent steps being only required for
justifying it in larger regions of the complex plane.

(i) When $z \to 1^{-}$ along the real line: Set $w = -\log z$ and introduce

\[ (42) \qquad \Lambda(w) := \mathrm{Li}_{\alpha}(e^{-w}) = \sum_{n \ge 1} \frac{e^{-nw}}{n^{\alpha}}. \]

This is a harmonic sum in the sense of Mellin transform theory, so that the Mellin
transform of $\Lambda$ satisfies ($\Re(s) > \max(0, 1 - \alpha)$)

\[ (43) \qquad \Lambda^{\star}(s) \equiv \int_{0}^{\infty} \Lambda(w)\,w^{s-1}\,dw = \zeta(s + \alpha)\,\Gamma(s). \]

The function $\Lambda(w)$ can be recovered from the inverse Mellin integral,

\[ (44) \qquad \Lambda(w) = \frac{1}{2i\pi}\int_{c - i\infty}^{c + i\infty} \zeta(s + \alpha)\,\Gamma(s)\,w^{-s}\,ds, \]

with $c$ taken in the half-plane in which $\Lambda^{\star}(s)$ is defined. There are poles at $s =
0, -1, -2, \ldots$ due to the Gamma factor and a pole at $s = 1 - \alpha$ due to the zeta
function. Take $d$ to be of the form $-m - \frac12$ and smaller than $1 - \alpha$. Then, a standard
residue calculation, taking into account poles to the left of $c$ and based on

\[ (45) \qquad \Lambda(w) = \sum_{s_0 \in \{0, -1, \ldots, -m\} \cup \{1 - \alpha\}} \mathrm{Res}\big( \zeta(s + \alpha)\Gamma(s)w^{-s} \big)_{s = s_0} + \frac{1}{2i\pi}\int_{d - i\infty}^{d + i\infty} \zeta(s + \alpha)\,\Gamma(s)\,w^{-s}\,ds, \]

then yields a finite form of the estimate (41) of $\mathrm{Li}_{\alpha}$ (as $w \to 0$, corresponding to
$z \to 1^{-}$).

(ii) When $z \to 1^{-}$ in a cone of angle less than $\pi$ inside the unit disc: In that case,
we observe that the identity in (44) remains valid by analytic continuation, since the
integral is still convergent (this property owes to the fast decay of $\Gamma(s)$ towards $\pm i\infty$).
Then the residue calculation (45), on which the expansion of $\Lambda(w)$ is based in the real
case $w > 0$, still makes sense. The extension of the asymptotic expansion of $\mathrm{Li}_{\alpha}$ from
within the unit disc is thus granted.

(iii) When $z$ tends to 1 vertically: Details of the proof are given in [176]. What
is needed is a justification of the validity of expansion (41), when $z$ is allowed to tend
to 1 from the exterior of the unit disc. The key to the analysis is a Lindelöf integral
representation of the polylogarithm (Notes IV.7 and IV.8, p. 225), which provides
analytic continuation. To wit:

\[ \mathrm{Li}_{\alpha}(-z) = -\frac{1}{2i\pi}\int_{1/2 - i\infty}^{1/2 + i\infty} \frac{z^{s}}{s^{\alpha}}\,\frac{\pi}{\sin \pi s}\,ds. \]

The proof then proceeds with the analysis of the polylogarithm when $z = e^{i(w - \pi)}$
and $s = 1/2 + it$, the integral being estimated asymptotically as a harmonic integral
(a continuous analogue of harmonic sums [502]) by means of Mellin transforms. The
extension to a cone with vertex at 1, having a vertical symmetry and angle less than $\pi$,
then follows by an analytic continuation argument. By unicity of asymptotic expansions
(the horizontal cone of Parts (i) and (ii) and the vertical cone have a nonempty
intersection), the resulting expansion must coincide with the one calculated explicitly
in Part (i), above.

To conclude, regarding the general case $r > 0$, we may proceed along similar
lines, with each $\log n$ factor introducing a derivative of the Riemann zeta function,
hence a multiple pole at $s = 1$. It can then be checked that the resulting expansion
coincides with what is given by formally differentiating the expansion of $\mathrm{Li}_{\alpha}$ a number
of times equal to $r$. (See also Note 18 below.)

\[ \begin{aligned} \mathrm{Li}_{-1/2}(z) &\equiv \sum_{n \ge 1} \sqrt{n}\,z^{n} = \frac{\sqrt{\pi}}{2(1-z)^{3/2}} - \frac{3\sqrt{\pi}}{8(1-z)^{1/2}} + \zeta(-\tfrac12) + O\big((1-z)^{1/2}\big) \\ \mathrm{Li}_{0}(z) &\equiv \sum_{n \ge 1} z^{n} = \frac{1}{1-z} - 1 \\ \mathrm{Li}_{0,1}(z) &\equiv \sum_{n \ge 1} \log n\;z^{n} = \frac{L(z) - \gamma}{1-z} - \frac12 L(z) + \frac{\gamma - 1}{2} + \log\sqrt{2\pi} + O\big((1-z)L(z)\big) \\ \mathrm{Li}_{1/2}(z) &\equiv \sum_{n \ge 1} \frac{z^{n}}{\sqrt{n}} = \sqrt{\frac{\pi}{1-z}} + \zeta(\tfrac12) - \frac14\sqrt{\pi}\,\sqrt{1-z} + O(1-z) \\ \mathrm{Li}_{1/2,1}(z) &\equiv \sum_{n \ge 1} \frac{\log n}{\sqrt{n}}\,z^{n} = \sqrt{\frac{\pi}{1-z}}\,\big( L(z) - \gamma - 2\log 2 \big) + \cdots \\ \mathrm{Li}_{1}(z) &\equiv \sum_{n \ge 1} \frac{z^{n}}{n} = L(z) \\ \mathrm{Li}_{2}(z) &\equiv \sum_{n \ge 1} \frac{z^{n}}{n^{2}} = \frac{\pi^{2}}{6} - \big( L(z) + 1 \big)(1-z) - \Big( \frac14 + \frac12 L(z) \Big)(1-z)^{2} + \cdots \end{aligned} \]

FIGURE VI.11. Sample expansions of polylogarithms ($L(z) := \log(1-z)^{-1}$).

Figure 11 provides a table of expansions relative to commonly encountered polylogarithms
(the function $\mathrm{Li}_2$ is also known as a dilogarithm). Example 9 illustrates
the use of polylogarithms for establishing a class of asymptotic expansions of which
Stirling's formula appears as a special case. Further uses of Theorem VI.7 will appear
in the next sections.

EXAMPLE VI.9. Stirling's formula, polylogarithms, and superfactorials. One has

\[ \sum_{n \ge 1} \log n!\;z^{n} = (1-z)^{-1}\,\mathrm{Li}_{0,1}(z), \]

to which singularity analysis is applicable. Theorem VI.7 then yields the singular expansion

\[ \frac{1}{1-z}\,\mathrm{Li}_{0,1}(z) \sim \frac{L(z) - \gamma}{(1-z)^{2}} + \frac12\,\frac{-L(z) + \gamma - 1 + \log 2\pi}{1-z} + \cdots, \]

from which Stirling's formula reads off:

\[ \log n! \sim n\log n - n + \frac12\log n + \log\sqrt{2\pi} + \cdots. \]

(Stirling's constant $\log\sqrt{2\pi}$ comes out neatly as $-\zeta'(0)$.) Similarly, define the superfactorial
function to be $1^{1}2^{2}\cdots n^{n}$. One has

\[ \sum_{n \ge 1} \log(1^{1}2^{2}\cdots n^{n})\,z^{n} = \frac{1}{1-z}\,\mathrm{Li}_{-1,1}(z), \]

to which singularity analysis is mechanically applicable. The analogue of Stirling's formula
then reads:

\[ 1^{1}2^{2}\cdots n^{n} \sim A\,n^{\frac12 n^{2} + \frac12 n + \frac{1}{12}}\,e^{-\frac14 n^{2}}, \qquad A := \exp\Big( \frac{1}{12} - \zeta'(-1) \Big) = \exp\Big( -\frac{\zeta'(2)}{2\pi^{2}} + \frac{\log(2\pi) + \gamma}{12} \Big). \]

The constant $A$ is known as the Glaisher-Kinkelin constant [165, p. 135]. Higher order factorials
can be treated similarly. ........................ END OF EXAMPLE VI.9.
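Both asymptotic forms of this example are easy to test numerically. The sketch below compares $\log n!$ (through `math.lgamma`) with the four-term Stirling estimate, and the logarithm of $1^1 2^2\cdots n^n$ with the logarithm of the estimate $A\,n^{n^2/2+n/2+1/12}e^{-n^2/4}$. The value of the Glaisher-Kinkelin constant is hard-coded as a reference constant, and the function names are ours.

```python
import math

GLAISHER_A = 1.2824271291006226    # Glaisher-Kinkelin constant, hard-coded reference value

def stirling_log_factorial(n):
    """n log n - n + (1/2) log n + log sqrt(2 pi)."""
    return n * math.log(n) - n + 0.5 * math.log(n) + 0.5 * math.log(2.0 * math.pi)

def log_superfactorial(n):
    """log(1^1 * 2^2 * ... * n^n), by direct summation."""
    return math.fsum(k * math.log(k) for k in range(1, n + 1))

def log_superfactorial_estimate(n):
    """log of A * n^(n^2/2 + n/2 + 1/12) * exp(-n^2/4)."""
    exponent = 0.5 * n * n + 0.5 * n + 1.0 / 12.0
    return math.log(GLAISHER_A) + exponent * math.log(n) - 0.25 * n * n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              math.lgamma(n + 1) - stirling_log_factorial(n),
              log_superfactorial(n) - log_superfactorial_estimate(n))
```

The first difference decays like $1/(12n)$, the next term of Stirling's series; the second decays faster still, so the hard-coded constant is effectively recovered from the sums.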


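▷ VI.18. Polylogarithms of integral index and a general formula. Let $\alpha = m \in \mathbb{Z}_{\ge 1}$. Then:

\[ \mathrm{Li}_{m}(z) = \frac{(-1)^{m}}{(m-1)!}\,w^{m-1}\big( \log w - H_{m-1} \big) + \sum_{j \ge 0,\ j \ne m-1} \frac{(-1)^{j}}{j!}\,\zeta(m - j)\,w^{j}, \]

where $H_m$ is the harmonic number and $w = -\log z$. [The line of proof is the same as in
Theorem VI.7, only the residue calculation at $s = 1$ differs.] The general formula,

\[ \mathrm{Li}_{\alpha,r}(z) \sim (-1)^{r}\,\frac{\partial^{r}}{\partial\alpha^{r}} \sum_{s \in \mathbb{Z}_{\le 0} \cup \{1 - \alpha\}} \mathrm{Res}\big[ \zeta(s + \alpha)\,\Gamma(s)\,w^{-s} \big], \qquad w := -\log z, \]

holds for all $\alpha \in \mathbb{C}$ and $r \in \mathbb{Z}_{\ge 0}$ and is amenable to symbolic manipulation. ◁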


VI.9. Functional composition 


Let $f$ and $g$ be functions analytic at the origin that have nonnegative coefficients.
We consider the composition

\[ h = f \circ g, \qquad h(z) = f(g(z)), \]

assuming $g(0) = 0$. Let $\rho_f$, $\rho_g$, $\rho_h$ be the corresponding radii of convergence, and let
$\tau_f = f(\rho_f)$, and so on. We shall assume that $f$ and $g$ are $\Delta$-continuable and that
they admit singular expansions in the scale of powers. There are three cases to be
distinguished depending on how $\tau_g$ compares to $\rho_f$. Clearly one has:

— Supercritical case, when $\tau_g > \rho_f$. In that case, when $z$ increases from 0,
there is a value $r$ strictly less than $\rho_g$ such that $g(r)$ attains the value $\rho_f$,
which triggers a singularity of $f \circ g$. In other words $r \equiv \rho_h = g^{(-1)}(\rho_f)$.
Around this point, $g$ is analytic and a singular expansion of $f \circ g$ is obtained
by composing the singular expansion of $f$ with the regular expansion of $g$
at $r$. The singularity type is that of the external function ($f$).

— Subcritical case, when $\tau_g < \rho_f$. In this dual situation, the singularity of $f \circ g$
is driven by that of the inside function $g$. We have $\rho_h = \rho_g$, $\tau_h = f(\tau_g)$
and the singular expansion of $f \circ g$ is obtained by composing the regular
expansion of $f$ with the singular expansion of $g$ at $\rho_g$. The singularity type
is that of the internal function ($g$).

— Critical case, when $\tau_g = \rho_f$. In this boundary case, there is a confluence
of singularities. We have $\rho_h = \rho_g$, $\tau_h = \tau_f$, and the singular expansion is
obtained by composition rules of the singular expansions. The singularity
type is a mix of the types of the internal and external functions ($f$, $g$).

This terminology extends the notion of supercritical sequence schema introduced
in Chapter V, where we considered the case $f(z) = (1 - z)^{-1}$ and discussed some
of the probabilistic consequences. Rather than stating general conditions that would
be unwieldy, it is better to discuss examples directly, referring to the above guidelines
supplemented by the plain algebra of generalized power expansions, whenever
necessary.


⁵ This terminology is in accordance with the notion of supercritical sequence schema in Section V.4
(p. 313), for which the external function is $f(z) = (1 - z)^{-1}$, with $\rho_f = 1$.

EXAMPLE VI.10. "Supertrees". Let $\mathcal{G}$ be the class of general Catalan trees:

\[ \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad G(z) = \frac12\big( 1 - \sqrt{1 - 4z} \big). \]

The radius of convergence of $G(z)$ is $\frac14$ and the singular value is $G(\frac14) = \frac12$. The class $\mathcal{Z}\mathcal{G}$
consists of planted trees, which are such that to the root is attached a stem and an extra node,
with OGF equal to $zG(z)$. We then introduce two classes of supertrees defined by substitution:

\[ \begin{aligned} \mathcal{H} = \mathcal{G}[\mathcal{Z}\mathcal{G}] &\quad\Longrightarrow\quad H(z) = G(zG(z)) \\ \mathcal{K} = \mathcal{G}[(\mathcal{Z} + \mathcal{Z}')\mathcal{G}] &\quad\Longrightarrow\quad K(z) = G(2zG(z)). \end{aligned} \]

These are "trees of trees": the class $\mathcal{H}$ is formed of trees such that, on each node there is grafted
a planted tree (by the combinatorial substitution of Section I.6, p. 77); the class $\mathcal{K}$ similarly
corresponds to the case when the stems can be of any two colours. Incidentally, combinatorial
sum expressions are available for the coefficients,

\[ H_n = \sum_{k=1}^{\lfloor n/2 \rfloor} \frac{1}{n-k}\binom{2k-2}{k-1}\binom{2n-3k-1}{n-k-1}, \qquad K_n = \sum_{k=1}^{\lfloor n/2 \rfloor} \frac{2^{k}}{n-k}\binom{2k-2}{k-1}\binom{2n-3k-1}{n-k-1}, \]

the initial values being given by

\[ H(z) = z^2 + z^3 + 3z^4 + 7z^5 + 21z^6 + \cdots, \qquad K(z) = 2z^2 + 2z^3 + 8z^4 + 18z^5 + 64z^6 + \cdots. \]

FIGURE VI.12. A binary supertree is a "tree of trees", with component trees all binary.
The number of binary supertrees with $2n$ nodes has the unusual asymptotic form $c\,4^{n}n^{-5/4}$.

Since $\rho_G = \frac14$ and $\tau_G = \frac12$, the composition scheme is subcritical in the case of $\mathcal{H}$ and
critical in the case of $\mathcal{K}$. In the first case, the singularity is of square-root type and one finds
easily:

\[ H(z) \underset{z \to \frac14}{\sim} \frac{2 - \sqrt{2}}{4} - \frac{1}{\sqrt{8}}\sqrt{\frac14 - z} \quad\Longrightarrow\quad H_n \sim \frac{4^{n}}{8\sqrt{2\pi}\,n^{3/2}}. \]

In the second case, the two square-roots combine to produce a fourth root:

\[ K(z) \underset{z \to \frac14}{\sim} \frac12 - \frac{1}{\sqrt{2}}\Big( \frac14 - z \Big)^{1/4} \quad\Longrightarrow\quad K_n \sim \frac{4^{n}}{8\,\Gamma(\frac34)\,n^{5/4}}. \]

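The enumerations above are easy to cross-check by machine. The following is a minimal sketch, not part of the original text: it composes truncated power series to obtain the first coefficients of $H(z) = G(zG(z))$ and $K(z) = G(2zG(z))$, compares them with the binomial sums for $H_n$ and $K_n$, and then looks at the ratio between the exact counts and the two asymptotic estimates. The helper names (`mul`, `compose`, `supertree_sum`, `ratio_H`, `ratio_K`) are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
from math import comb, gamma, pi, sqrt

def catalan_trees(N):
    """Coefficients G_0..G_N of G(z) = (1 - sqrt(1 - 4z))/2, i.e. G_n = Catalan(n-1)."""
    return [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, N + 1)]

def mul(a, b, N):
    """Product of two truncated power series (coefficient lists), modulo z^(N+1)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a[: N + 1]):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c

def compose(f, g, N):
    """f(g(z)) modulo z^(N+1) by Horner's rule; requires g[0] == 0."""
    res = [0] * (N + 1)
    for coeff in reversed(f[: N + 1]):
        res = mul(res, g, N)
        res[0] += coeff
    return res

def supertree_sum(n, colours=1):
    """Binomial-sum expression for H_n (colours=1) or K_n (colours=2)."""
    total = Fraction(0)
    for k in range(1, n // 2 + 1):
        num = colours**k * comb(2 * k - 2, k - 1) * comb(2 * n - 3 * k - 1, n - k - 1)
        total += Fraction(num, n - k)
    return int(total)

N = 12
G = catalan_trees(N)
zG = [0] + G[:N]                            # coefficients of z G(z)
H = compose(G, zG, N)                       # H(z) = G(z G(z))
K = compose(G, [2 * c for c in zG], N)      # K(z) = G(2 z G(z))

def ratio_H(n):
    """Exact H_n divided by the estimate 4^n / (8 sqrt(2 pi) n^(3/2))."""
    return supertree_sum(n) * 8 * sqrt(2 * pi) * n**1.5 / 4**n

def ratio_K(n):
    """Exact K_n divided by the estimate 4^n / (8 Gamma(3/4) n^(5/4))."""
    return supertree_sum(n, 2) * 8 * gamma(0.75) * n**1.25 / 4**n

print(H[:7], K[:7])
print(ratio_H(400), ratio_K(400))
```

Both ratios approach 1 as n grows. Convergence is visibly slower for K, which is what one expects when the next singular term is only a factor $n^{-1/2}$ smaller than the leading one.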
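On a similar register, consider the class $\mathcal{B}$ of complete binary trees: 

$$\mathcal{B} = \mathcal{Z} + \mathcal{Z}\times\mathcal{B}\times\mathcal{B} \quad\Longrightarrow\quad B(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z},$$

and define the class of binary supertrees (Figure 12) by 

$$\mathcal{S} = \mathcal{B}(\mathcal{Z}\times\mathcal{B}) \quad\Longrightarrow\quad
S(z) = \frac{1 - \sqrt{2\sqrt{1 - 4z^2} - 1 + 4z^2}}{1 - \sqrt{1 - 4z^2}}.$$

The composition is critical since $zB(z) = \frac12$ at the dominant singularity $z = \frac12$. It is 
enough to consider the reduced function 

$$\overline{S}(z) = S(\sqrt{z}) = z + z^2 + 3z^3 + 8z^4 + 25z^5 + 80z^6 + 267z^7 + 911z^8 + \cdots,$$

whose coefficients constitute EIS A101490 and occur in Bousquet-Mélou's study of integrated 
superbrownian excursion [67]. We find 

$$\overline{S}(z) \sim 1 - \sqrt2\,(1-4z)^{1/4} + (1-4z)^{1/2} + \cdots
\quad\Longrightarrow\quad
\overline{S}_n = \frac{4^n}{n^{5/4}}\left(\frac{\sqrt2}{4\,\Gamma(\frac34)} - \frac{1}{2\sqrt{\pi}\,n^{1/4}} + \cdots\right).$$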
For instance, a seven term expansion yields a relative accuracy of 10~*, already for n = 100, 
so that such approximations are quite usable in practice. 
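
As a quick sanity check on the reduced series, here is a small sketch (an illustration added here, with hypothetical helper names). It uses the observation that $B(z) = z\,c(z^2)$, where $c(t) = \sum_n C_n t^n$ is the Catalan generating function, so that with $t = z^2$ and $w(t) = t\,c(t)$ the reduced function is $\overline{S}(t) = w\,c(w^2)$.

```python
from math import comb

def series_mul(a, b, N):
    """Product of two truncated power series, modulo t^(N+1)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a[: N + 1]):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c

def series_compose(f, g, N):
    """f(g(t)) modulo t^(N+1) by Horner's rule; requires g[0] == 0."""
    res = [0] * (N + 1)
    for coeff in reversed(f[: N + 1]):
        res = series_mul(res, g, N)
        res[0] += coeff
    return res

N = 8
catalan = [comb(2 * n, n) // (n + 1) for n in range(N + 1)]   # c(t) = sum C_n t^n
w = [0] + catalan[:N]                                         # w(t) = t c(t), reduced form of z B(z)
w2 = series_mul(w, w, N)
S_bar = series_mul(w, series_compose(catalan, w2, N), N)      # reduced S = w c(w^2)

print(S_bar[1:])
```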

The occurrence of the exponent $-\frac54$ in the enumeration of bicoloured and binary supertrees 
is striking. Related constructions have been considered by Kemp [291] who obtained more 
generally exponents of the form $-1 - 2^{-d}$ by iterating the substitution construction (this, in 
connection with what he called "multidimensional trees"). It is significant that asymptotic terms 
of the form $n^{p/q}$ with $q \neq 1, 2$ can appear in elementary combinatorics, even in the context 
of simple algebraic functions. Such exponents tend to be associated with nonstandard limit 
laws, akin to the stable distributions of probability theory; see also our discussion at the end of 
Chapter IX. END OF EXAMPLE VI.10. 

FIGURE VI.13. Typical weights (top, Equation (46)) and triangular arrays (bottom, 
Equation (47)) illustrating the discussion of combinatorial sums $S_n = \sum_{k=0}^{n} f_k\, g_k^{(n)}$. 


> VI.19. Supersupertrees. Define these by 
$$S^{[2]}(z) = B(zB(zB(z))).$$
We find automatically (with the help of B. Salvy’s program) 


=a 
[22r+1] II (2) . 9729/4p (Z) A298, 


and further extensions involving an asymptotic term $n^{-1-2^{-d}}$ are possible (see [291] for 
similar cases). ◁ 


Combinatorial sums. Singularity analysis permits us to discuss the asymptotic 
behaviour of entire classes of combinatorial sums at a fair level of generality, with 
asymptotic estimates coming out rather automatically. We consider here combinatorial 
sums of the form 


$$S_n = \sum_{k=0}^{n} f_k\, g_k^{(n)},$$

where $f_k$ is a sequence of numbers, usually of a simple form and called the weights, 
while the $g_k^{(n)}$ are a triangular array of numbers, for instance Pascal's triangle. 

As weights $f_k$ we shall consider sequences such that $f(z)$ is Δ-analytic with a 
singular expansion involving functions of the standard scale of Theorems VI.1, VI.2, 
VI.3. Typical examples for $f(z)$ and $(f_k)$ are⁶ displayed in Figure 13, Equation (46). 
The triangular arrays discussed here are taken to be coefficients of the powers of some 
fixed function, namely, 

$$g_k^{(n)} = [z^n]\,(g(z))^k \quad\text{where}\quad g(z) = \sum_{n\ge 1} g_n z^n,$$

with $g(z)$ an analytic function at the origin having nonnegative coefficients and satis- 
fying $g(0) = 0$. Examples are given in Figure 13, Equation (47). An interesting class 
of such arrays arises from the Lagrange inversion theorem. Indeed, if $g(z)$ is implic- 
itly defined by $g(z) = zG(g(z))$, one has $g_k^{(n)} = \frac{k}{n}[w^{n-k}]\,G(w)^n$; the last three cases 
of (47) are obtained in this way (by taking $G(w)$ as $1/(1 - w)$, $(1 + w)^2$, $e^w$). 

6 Weights like $\log k$ and $\sqrt{k}$ also satisfy these conditions, as seen in Section VI.8. 

By design, the generating function of the $S_n$ is simply 

$$S(z) = \sum_{n\ge 0} S_n z^n = f(g(z)) \quad\text{with}\quad f(z) := \sum_{k\ge 0} f_k z^k.$$

Consequently, the asymptotic analysis of $S_n$ results by inspection from the way sin- 
gularities of $f(z)$ and $g(z)$ get transformed by composition. 


EXAMPLE VI.11. Bernoulli sums. Let $\phi$ be a function from $\mathbb{Z}_{\ge 0}$ to $\mathbb{R}$ and write $f_k := \phi(k)$. 
Consider the sums 

$$S_n := \sum_{k=0}^{n} \phi(k)\,\frac{1}{2^n}\binom{n}{k}.$$

If $X_n$ is a binomial random variable⁷, $X_n \in \mathrm{Bin}(n, \frac12)$, then $S_n = E(\phi(X_n))$ is exactly the 
expectation of $\phi(X_n)$. Then, by the binomial theorem, the OGF of the sequence $(S_n)$ is: 

$$S(z) = \frac{2}{2-z}\, f\!\left(\frac{z}{2-z}\right).$$

Considering weights whose generating function has, like in (46), radius of convergence 1, what 
we have is a variant of the composition schema, with an additional prefactor. The composition 
scheme is of the supercritical type since the function $g(z) = z/(2 - z)$, which has radius of 
convergence equal to 2, satisfies $\tau_g = \infty$. The singularities of $S(z)$ are then of the same type as 
those of the weight generating function $f(z)$ and one verifies, in all cases of (46), that, to first 
asymptotic order, $S_n \sim \phi(n/2)$: this is in agreement with the fact that the binomial distribution 
is concentrated near its mean $\frac{n}{2}$. Singularity analysis provides complete asymptotic expansions, 
for instance, 


1 2 2 6 24 
E(— | X, Ba Va Ate 
(=| >0) te he tO. ”) 
E (Hx,) 2 ee are 4 Om) 
a — MBIT YT On Tan? 
See [163, 176] for more along these lines. ................... END OF EXAMPLE VI.11. 
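
The identity $S(z) = \frac{2}{2-z} f\!\left(\frac{z}{2-z}\right)$ and the first-order estimate $S_n \sim \phi(n/2)$ can be checked on small cases. The sketch below is an illustration under stated assumptions: the function names are hypothetical, exact rational arithmetic is used for the series identity, and the weight $\phi(k) = H_k$ (harmonic numbers) is used for the asymptotic check, where $\phi(n/2)$ is read as $\log(n/2) + \gamma$.

```python
from fractions import Fraction
from math import comb, log

EULER_GAMMA = 0.5772156649015329

def bernoulli_sum(phi, n):
    """S_n = 2^(-n) * sum_k phi(k) * binom(n, k) = E(phi(X_n)) for X_n ~ Bin(n, 1/2)."""
    return sum(Fraction(phi(k)) * comb(n, k) for k in range(n + 1)) / 2**n

def mul(a, b, N):
    """Product of two truncated power series with rational coefficients."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a[: N + 1]):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c

def ogf_coeffs(phi, N):
    """Coefficients up to z^N of (2/(2-z)) * f(z/(2-z)), with f(z) = sum_k phi(k) z^k."""
    prefactor = [Fraction(1, 2**n) for n in range(N + 1)]   # 2/(2-z) = sum_n (z/2)^n
    g = [Fraction(0)] + prefactor[1:]                        # z/(2-z) = sum_{n>=1} (z/2)^n
    comp = [Fraction(0)] * (N + 1)
    for k in reversed(range(N + 1)):                         # Horner evaluation of f(g(z))
        comp = mul(comp, g, N)
        comp[0] += Fraction(phi(k))
    return mul(prefactor, comp, N)

def harmonic(k):
    """Harmonic number H_k as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, k + 1))

N = 10
print(ogf_coeffs(lambda k: k * k, N)[:5])
print(float(bernoulli_sum(harmonic, 200)), log(100) + EULER_GAMMA)
```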


7 A binomial random variable is a sum of Bernoulli random variables: $X_n = \sum_{j=1}^{n} Y_j$ where the $Y_j$ 
are independent and distributed like a Bernoulli variable $Y$, with $P(Y = 1) = p$, $P(Y = 0) = q = 1 - p$. 

EXAMPLE VI.12. Generalized Knuth–Ramanujan Q-functions. For reasons motivated by 
analysis of algorithms, Knuth has encountered repeatedly sums of the form 

$$Q_n(\{f_k\}) = f_0 + f_1\frac{n}{n} + f_2\frac{n(n-1)}{n^2} + f_3\frac{n(n-1)(n-2)}{n^3} + \cdots.$$

(See, e.g., [312, pp. 305-307].) There $(f_k)$ is a sequence of coefficients (usually of at most 
polynomial growth). For instance, the case $f_k \equiv 1$ yields the expected time till the first collision 
in the birthday paradox problem (Section II.3, p. 105). 

A closer examination shows that the analysis of such $Q_n$ is reducible to singularity analy- 
sis. Writing 

$$Q_n(\{f_k\}) = f_0 + \frac{n!}{n^{n-1}}\sum_{k\ge 1} f_k\,\frac{n^{n-k-1}}{(n-k)!}$$

reveals the closeness with the last column of (47). Indeed, setting 

$$F(z) = \sum_{k\ge 1} \frac{f_k}{k}\, z^k,$$

one has ($n \ge 1$) 

$$Q_n = f_0 + \frac{n!}{n^{n-1}}\,[z^n]\,S(z) \quad\text{where}\quad S(z) = F(T(z)),$$

and $T(z)$ is the Cayley tree function ($T = ze^{T}$). 

For weights $f_k = \phi(k)$ of polynomial growth, the schema is critical. Then, the singular 
expansion of $S$ is obtained by composing the singular expansion of $F$ with the expansion of $T$, 
namely, $T(z) \sim 1 - \sqrt2\,\sqrt{1 - ez}$ as $z \to e^{-1}$. For instance, if $\phi(k) = k^r$ for some integer $r \ge 1$ 
then $F(z)$ has an $r$th order pole at $z = 1$. Then, the singularity type of $F(T(z))$ is $Z^{-r/2}$ 
where $Z = (1 - ez)$, which is reflected by $S_n \asymp e^n n^{r/2-1}$ (we use '$\asymp$' to represent order- 
of-growth information, disregarding multiplicative constants). After the final normalization, we 
see that $Q_n \asymp n^{(r+1)/2}$. Globally, for many weights of the form $f_k = \phi(k)$, we expect $Q_n$ to 
be of the form $\sqrt{n}\,\phi(\sqrt{n})$, in accordance with the fact that the expectation of the first collision 
in the birthday problem is on average near $\sqrt{\pi n/2}$. END OF EXAMPLE VI.12. 
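
The order-of-growth claims can be probed numerically. The sketch below is illustrative only and rests on one assumption that should be stated loudly: the sums are read as $Q_n(\{f_k\}) = f_0 + \sum_{k\ge1} f_k\, n(n-1)\cdots(n-k+1)/n^k$, so that $f_k \equiv 1$ gives $1 + Q(n)$ with $Q(n)$ Knuth's classical function, for which the expansion $Q(n) = \sqrt{\pi n/2} - \frac13 + O(n^{-1/2})$ is known. The function name `Q` is hypothetical.

```python
from math import pi, sqrt

def Q(n, f):
    """Float evaluation of Q_n({f_k}) = f(0) + sum_{k>=1} f(k) * n(n-1)...(n-k+1) / n^k.

    Assumption: this is the reading of the sums used in the example; the
    factor n(n-1)...(n-k+1)/n^k is updated incrementally in `weight`.
    """
    total, weight = float(f(0)), 1.0
    for k in range(1, n + 1):
        weight *= (n - k + 1) / n
        total += f(k) * weight
    return total

n = 10_000
birthday = Q(n, lambda k: 1)        # constant weights: expected time to first collision
linear = Q(n, lambda k: k)          # weights phi(k) = k, i.e. r = 1

print(birthday, sqrt(pi * n / 2) + 2 / 3)
print(linear, n)
```

With constant weights the value is of order $\sqrt{n}$, and with $\phi(k) = k$ it is of order $n = n^{(r+1)/2}$ for $r = 1$, in line with the heuristic $\sqrt{n}\,\phi(\sqrt{n})$.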


> VI.20. General Bernoulli sums. Let Xn € Bin(n;p) be a binomial random variable with 
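general parameters p, q: 

$$P(X_n = k) = \binom{n}{k}\, p^k q^{n-k}, \qquad q = 1 - p.$$

Then with $f_k = \phi(k)$, one has 

$$E(\phi(X_n)) = [z^n]\,\frac{1}{1 - qz}\, f\!\left(\frac{pz}{1 - qz}\right),$$

so that the analysis develops as in the case $\mathrm{Bin}(n; \frac12)$. ◁ 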


> VI.21. Higher moments of the birthday problem. Take the model where there are n days 
in the year and let B be the random variable representing the first birthday collision. Then 


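$$P_n(B > k) = k!\, n^{-k}\binom{n}{k}, \quad\text{and}\quad
E_n(\Phi(B)) = \Phi(1) + Q_n(\{\Delta\Phi(k)\}), \quad\text{where}\quad \Delta\Phi(k) := \Phi(k+1) - \Phi(k).$$

For instance $E_n(B) = 1 + Q_n(\langle 1, 1, \ldots\rangle)$. We thus get moments of various functionals (here 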
stated to two asymptotic terms): 


In+2 3/28 — 
via singularity analysis. <q 


> VI.22. How to weigh an urn? The “shake-and-paint” algorithm. You are given an urn 
containing an unknown number $N$ of identical looking balls. How to estimate this number in 
much fewer than $O(N)$ operations? A probabilistic solution due to Brassard and Bratley [71] 
uses a brush and paint. Shake the urn, pull out a ball, then mark it with paint and replace it 
into the urn. Repeat until you find an already painted ball. Let $X$ be the number of operations. 
One has $E(X) \sim \sqrt{\pi N/2}$. Furthermore the quantity $Y := X^2/2$ constitutes, by the previous 
note, an asymptotically unbiased estimator of $N$, in the sense that $E(Y) \sim N$. In other words, 
count the time till an already painted ball is first found, and return half of the square of this time. 
One also has $\sqrt{V(Y)} \sim N$. By performing the experiment $m$ times (using $m$ different colours 
of paint) and by taking the arithmetic average of the $m$ estimates, one obtains an unbiased 
estimator whose typical accuracy is $\sqrt{1/m}$. For instance, $m = 16$ gives an expected accuracy 
of 25%. (Similar principles are used in the design of data mining algorithms.) ◁ 
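
The two first-order claims, $E(X) \sim \sqrt{\pi N/2}$ and $E(Y) \sim N$, can be checked exactly rather than by simulation. The following sketch is an illustration with hypothetical names: it uses the tail probabilities $P(X > k) = N(N-1)\cdots(N-k+1)/N^k$ together with the elementary identities $E(X) = \sum_{k\ge0} P(X > k)$ and $E(X^2) = \sum_{k\ge0} (2k+1)\,P(X > k)$.

```python
from fractions import Fraction
from math import pi, sqrt

def collision_moments(N):
    """Exact E(X) and E(X^2), X = number of draws until a painted ball reappears."""
    ex = ex2 = Fraction(0)
    tail = Fraction(1)                      # P(X > k), starting at k = 0
    for k in range(N + 1):
        ex += tail                          # E(X)   = sum_k P(X > k)
        ex2 += (2 * k + 1) * tail           # E(X^2) = sum_k (2k + 1) P(X > k)
        tail *= Fraction(N - k, N)          # P(X > k+1) = P(X > k) * (N - k)/N
    return ex, ex2

N = 400
ex, ex2 = collision_moments(N)
estimator_mean = ex2 / 2                    # E(Y) with Y = X^2 / 2

print(float(ex), sqrt(pi * N / 2) + 2 / 3)
print(float(estimator_mean) / N)
```

For this model one finds $E(X^2) = 2N + E(X)$ exactly, so that $E(Y) = N + E(X)/2$: the relative bias of $Y$ is of order $N^{-1/2}$, consistent with asymptotic unbiasedness.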


> VI.23. Catalan sums. These are defined by 


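$$S_n := \sum_{k\ge 0} f_k \binom{2n}{n-k}, \qquad
S(z) = \frac{1}{\sqrt{1 - 4z}}\; f\!\left(\frac{1 - 2z - \sqrt{1 - 4z}}{2z}\right).$$

The case when $\rho_f = 1$ corresponds to a critical composition, which can be discussed much in 
the same way as Ramanujan sums. ◁ 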


Schemas. Singularity analysis also enables us to discuss at a fair level of gener- 
ality the behaviour of schemas, in a way that parallels the discussion of the sequence 
schema, based on a meromorphic analysis (Section V.4, p. 313). We illustrate this 
point here by means of the supercritical cycle schema. Deeper examples relative to 
recursively defined structures are developed in Chapter VII. 


EXAMPLE VI.13. Supercritical cycle schema. The schema $\mathcal{H} = \mathrm{CYC}(\mathcal{G})$ forms labelled 
cycles from basic components in $\mathcal{G}$: 

$$\mathcal{H} = \mathrm{CYC}(\mathcal{G}) \quad\Longrightarrow\quad H(z) = \log\frac{1}{1 - G(z)}.$$

Consider the case where $G$ attains the value 1 before becoming singular, that is, $\tau_G > 1$. 
This corresponds to a supercritical composition schema, which can be discussed in a way 
that closely parallels the supercritical sequence schema (Section V.4, p. 313): a logarithmic 
singularity replaces a polar singularity. 
Let $\sigma := \rho_H$, which is determined by $G(\sigma) = 1$. First, one finds: 

$$H(z) \underset{z\to\sigma}{\sim} \log\frac{1}{1 - z/\sigma} - \log(\sigma G'(\sigma)) + A(z),$$

where $A(z)$ is analytic at $z = \sigma$. Thus: 

$$[z^n]\,H(z) \sim \frac{\sigma^{-n}}{n}.$$

(The error term implicit in this estimate is exponentially small.) 

The BGF $H(z, u) = \log(1 - uG(z))^{-1}$ has the variable $u$ marking the number of com- 
ponents in $\mathcal{H}$-objects. In particular, the mean number of components in a random $\mathcal{H}$-object of 
size $n$ is $\sim \lambda n$, where $\lambda = 1/(\sigma G'(\sigma))$, and the distribution is concentrated around its mean. 
Similarly, the mean number of components with size $k$ in a random $\mathcal{H}_n$ object is found to be 
asymptotic to $\lambda g_k n \sigma^k$, where $g_k = [z^k]G(z)$. END OF EXAMPLE VI.13. 
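
To see how sharp these estimates are, one can take a concrete instance. The sketch below is an illustration, not an example from the text: it assumes $G(z) = e^z - 1$ (components are nonempty sets), so that $H(z) = \log\frac{1}{2 - e^z}$, $\sigma = \log 2$, $G'(\sigma) = 2$ and $\lambda = 1/(2\log 2)$. Coefficients are computed exactly from $H'(z) = G'(z)/(1 - G(z))$, and the mean number of components is obtained from $[z^n]\,G(z)/(1 - G(z))$.

```python
from fractions import Fraction
from math import factorial, log

def series_inverse(a, N):
    """Coefficients of 1/a(z) modulo z^(N+1); requires a[0] != 0."""
    inv = [Fraction(1) / a[0]]
    for n in range(1, N + 1):
        s = sum(a[j] * inv[n - j] for j in range(1, n + 1))
        inv.append(-s / a[0])
    return inv

N = 12
G = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, N + 1)]   # e^z - 1
seq = series_inverse([Fraction(1)] + [-c for c in G[1:]], N)               # 1/(1 - G(z))
dG = [(n + 1) * G[n + 1] for n in range(N)]                                # G'(z)
dH = [sum(dG[j] * seq[m - j] for j in range(m + 1)) for m in range(N)]     # H' = G'/(1 - G)
H = [Fraction(0)] + [dH[n - 1] / n for n in range(1, N + 1)]               # integrate term by term

sigma = log(2)                      # G(sigma) = 1
lam = 1 / (sigma * 2)               # lambda = 1/(sigma G'(sigma)), with G'(sigma) = e^sigma = 2

n = N
count_ratio = n * float(H[n]) * sigma**n        # [z^n]H divided by sigma^(-n)/n
mean_components = float(seq[n] / H[n])          # [z^n] G/(1-G) divided by [z^n] H
print(count_ratio, mean_components / (lam * n))
```

Already for n = 12 both ratios agree with 1 to many digits, which illustrates the exponentially small error term in this particular instance.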


VI.10. Closure properties 


At this stage, we have available composition rules for singular expansions under 
operations like $+$, $\times$, $\div$. These are induced by corresponding rules for extended 
formal power series, where generalized exponents and logarithmic factors are allowed. 
Also, from Section VI.7, inversion of analytic functions normally gives rise to square- 
root singularities and, from Section VI.9, functions amenable to singularity analysis 
are essentially closed under composition. 
In this section⁸ we show that functions of singularity analysis class satisfy explicit 
closure properties under differentiation, integration, and Hadamard product. In order 
to keep the developments simple, we shall mostly restrict attention to functions that 
are Δ-analytic and admit a simple singular expansion of the form 

(48)    $$f(z) = \sum_{j=0}^{J} c_j (1 - z)^{\alpha_j} + O\big((1 - z)^{A}\big),$$

or a simple singular expansion with logarithmic terms 

(49)    $$f(z) = \sum_{j=0}^{J} c_j(L(z))\,(1 - z)^{\alpha_j} + O\big((1 - z)^{A}\big), \qquad L(z) := \log\frac{1}{1 - z},$$

where each $c_j$ is a polynomial. These are the most frequently occurring in applica- 
tions, and the proof techniques are easily extended to deal with more general situations. 

Subsection VI.10.1 treats differentiation and integration; Subsection VI.10.2 pre- 
sents the closure of functions that admit simple expansions under Hadamard prod- 
uct. Finally, Subsection VI.10.3 concludes with an examination of several interesting 
classes of tree recurrences, where all the closure properties previously established are 
put to use in order to quantify precisely the asymptotic behaviour of recurrences that 
are attached to tree models. 


VI.10.1. Differentiation and integration. Functions of singularity analysis class 
are closed under differentiation; this is in sharp contrast with real analysis. In the sim- 
ple cases⁹ of (48) and (49), closure under integration is also satisfied. The general 
principle (Theorems VI.8 and VI.9 below) is the following: Derivatives and primi- 
tives of functions that are of singularity analysis class admit singular expansions that 
can be obtained term by term via formal differentiation and integration. 

The following statement is a version tuned to our needs of well-known differen- 
tiability properties of complex asymptotic expansions (see, e.g., Olver's book [381, 
p. 9]). 

Theorem VI.8 (Singular differentiation). Let $f(z)$ be Δ-analytic with a singular ex- 
pansion near its singularity of the simple form 

$$f(z) = \sum_{j=0}^{J} c_j (1 - z)^{\alpha_j} + O\big((1 - z)^{A}\big).$$

Then, for each integer $r > 0$, the derivative $\frac{d^r}{dz^r} f(z)$ is Δ-analytic. The expansion of 
the derivative at the singularity is obtained through term-by-term differentiation: 

$$\frac{d^r}{dz^r} f(z) = (-1)^r \sum_{j=0}^{J} c_j\, \frac{\Gamma(\alpha_j + 1)}{\Gamma(\alpha_j + 1 - r)}\,(1 - z)^{\alpha_j - r} + O\big((1 - z)^{A - r}\big).$$

8 This section contains supplementary material that may be omitted on a first reading. The contents 
are liberally borrowed from an article of Fill, Flajolet, and Kapur [163]. 

9 It would be possible but unwieldy to treat a larger class, which would have to include arbitrarily 
nested logarithms, since, for instance, $\int dx/x = \log x$, $\int dx/(x \log x) = \log\log x$, and so on. 

FIGURE VI.14. The geometry of the contour $\gamma(z)$, a circle of radius $\kappa|1 - z|$, used in the 
proof of the differentiation theorem. 

PROOF. Clearly, all that is required is to establish the effect of differentiation on error 
terms, which is expressed symbolically as 

$$\frac{d}{dz}\, O\big((1 - z)^{A}\big) = O\big((1 - z)^{A-1}\big).$$

By bootstrapping, only the case of a single differentiation ($r = 1$) needs to be consid- 
ered. 

Let $g(z)$ be a function that is regular in a domain $\Delta(\phi, \eta)$ where it is assumed to 
satisfy $g(z) = O((1 - z)^{A})$ for $z \in \Delta$. Choose a subdomain $\Delta' := \Delta(\phi', \eta')$, where 
$\phi < \phi' < \frac{\pi}{2}$ and $0 < \eta' < \eta$. By elementary geometry, for a sufficiently small $\kappa > 0$, 
the disc of radius $\kappa|z - 1|$ centered at a value $z \in \Delta'$ lies entirely in $\Delta$; see Figure 14. 
We fix such a small value $\kappa$ and let $\gamma(z)$ represent the boundary of that disc oriented 
positively. 

The starting point is Cauchy's integral formula 

(50)    $$g'(z) = \frac{1}{2\pi i} \int_{C} g(w)\, \frac{dw}{(w - z)^2},$$

a direct consequence of the residue theorem. Here $C$ should encircle $z$ while lying 
inside the domain of regularity of $g$, and we opt for the choice $C = \gamma(z)$. Then trivial 
bounds applied to (50) give 

$$|g'(z)| = O\big(\|\gamma(z)\| \cdot |1 - z|^{A}\, |1 - z|^{-2}\big) = O\big(|1 - z|^{A-1}\big).$$

The estimate involves the length of the contour, $\|\gamma(z)\|$, which is $O(1 - z)$ by con- 
struction, as well as the bound on $g$ itself, which is $O((1 - z)^{A})$ since all points of the 
contour are themselves at a distance exactly of the order of $|1 - z|$ from 1. 
> VI.24. Differentiation and logarithms. Let g(z) satisfy 


$$g(z) = O\big((1 - z)^{A} L(z)^{k}\big), \qquad L(z) = \log\frac{1}{1 - z},$$

for $k \in \mathbb{Z}_{\ge 0}$. Then, one has 

$$\frac{d^r}{dz^r}\, g(z) = O\big((1 - z)^{A - r} L(z)^{k}\big).$$

(The proof follows along the lines of Theorem VI.8.) ◁ 

It is well known that integration of asymptotic expansions is usually easier than 
differentiation. Here is a statement custom-tailored to our needs. 

Theorem VI.9 (Singular integration). Let $f(z)$ be Δ-analytic and admit a Δ-expansion 
near its singularity of the form 

$$f(z) = \sum_{j=0}^{J} c_j (1 - z)^{\alpha_j} + O\big((1 - z)^{A}\big).$$

Then $\int_0^z f(t)\, dt$ is Δ-analytic. Assume further that none of the quantities $\alpha_j$ and $A$ 
equals $-1$. 
(i) If $A < -1$, then the singular expansion of $\int f$ is 

(51)    $$\int_0^z f(t)\, dt = -\sum_{j=0}^{J} \frac{c_j}{\alpha_j + 1}\,(1 - z)^{\alpha_j + 1} + O\big((1 - z)^{A+1}\big).$$

(ii) If $A > -1$, then the singular expansion of $\int f$ is 

(52)    $$\int_0^z f(t)\, dt = -\sum_{j=0}^{J} \frac{c_j}{\alpha_j + 1}\,(1 - z)^{\alpha_j + 1} + L_0 + O\big((1 - z)^{A+1}\big),$$

where the "integration constant" $L_0$ has the value 

$$L_0 := \sum_{\alpha_j < -1} \frac{c_j}{\alpha_j + 1} + \int_0^1 \Big[ f(t) - \sum_{\alpha_j < -1} c_j (1 - t)^{\alpha_j} \Big]\, dt.$$

The case where either some $\alpha_j$ or $A$ is $-1$ is easily treated by the additional rules 

$$\int_0^z (1 - t)^{-1}\, dt = L(z), \qquad \int_0^z O\big((1 - t)^{-1}\big)\, dt = O(L(z)),$$

that are consistent with elementary integration, and similar rules are easily derived for powers 
of logarithms. Furthermore, the corresponding O-transfers hold true. (The proofs are simple 
modifications of the one given below for the basic case.) 

PROOF. The basic technique consists in integrating, term by term, the singular expan- 
sion of $f$. We let $r(z)$ be the remainder term in the expansion of $f$, that is, 

$$r(z) := f(z) - \sum_{j=0}^{J} c_j (1 - z)^{\alpha_j}.$$

By assumption, throughout the Δ-domain one has, for some positive constant $K$, 
$|r(z)| \le K |1 - z|^{A}$. 

(i) Case $A < -1$. Straight-line integration between 0 and $z$ provides (51), as 
soon as it has been established that 

$$\int_0^z r(t)\, dt = O\big(|1 - z|^{A+1}\big).$$

By Cauchy's integral formula, we can choose any path of integration that stays within 
the region of analyticity of $r$. We choose the contour $\gamma := \gamma_1 \cup \gamma_2$, shown in Figure 15. 
Then, one has 

$$\left| \int_\gamma r(t)\, dt \right| \le \left| \int_{\gamma_1} r(t)\, dt \right| + \left| \int_{\gamma_2} r(t)\, dt \right|
\le K \int_{\gamma_1} |1 - t|^{A}\, |dt| + K \int_{\gamma_2} |1 - t|^{A}\, |dt| = O\big(|1 - z|^{A+1}\big),$$

where the symbol $|dt|$ designates the differential line-length element in the corre- 
sponding curvilinear integral. Both integrals are $O(|1 - z|^{A+1})$: for the integral 
along $\gamma_1$, this results from explicitly carrying out the integration; for the integral along 
$\gamma_2$, this results from the trivial bound $O(\|\gamma_2\|\,(1 - z)^{A})$. 

FIGURE VI.15. The contour used in the proof of the integration theorem. 

(ii) Case $A > -1$. We let $f_-(z)$ represent the "divergence part" of $f$ that gives 
rise to nonintegrability: 

$$f_-(z) := \sum_{\alpha_j < -1} c_j (1 - z)^{\alpha_j}.$$

Then with the decomposition $f = [f - f_-] + f_-$, integrations can be performed 
separately. First, one finds 

$$\int_0^z f_-(t)\, dt = -\sum_{\alpha_j < -1} \frac{c_j}{\alpha_j + 1}\,(1 - z)^{\alpha_j + 1} + \sum_{\alpha_j < -1} \frac{c_j}{\alpha_j + 1}.$$

Next, observe that the asymptotic condition guarantees the existence of $\int_0^1$ applied to 
$[f - f_-]$, so that 

$$\int_0^z [f(t) - f_-(t)]\, dt = \int_0^1 [f(t) - f_-(t)]\, dt + \int_1^z [f(t) - f_-(t)]\, dt.$$

The first of these two integrals is a constant that contributes to $L_0$. As to the second 
integral, term-by-term integration yields 

$$\int_1^z [f(t) - f_-(t)]\, dt = -\sum_{\alpha_j > -1} \frac{c_j}{\alpha_j + 1}\,(1 - z)^{\alpha_j + 1} + \int_1^z r(t)\, dt.$$

The remainder integral is finite, given the growth condition on the remainder term, 
and, upon carrying out the integration along the rectilinear segment joining 1 to $z$, 
trivial bounds show that it is indeed $O(|1 - z|^{A+1})$. 


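VI.10.2. Hadamard Products. The Hadamard product of two functions $f(z)$ 
and $g(z)$ analytic at the origin is defined as their term-by-term product, 

(53)    $$f(z) \odot g(z) = \sum_{n\ge 0} f_n g_n z^n, \quad\text{where}\quad f(z) = \sum_{n\ge 0} f_n z^n, \quad g(z) = \sum_{n\ge 0} g_n z^n.$$

We are going to see, following an article of Fill, Flajolet, and Kapur [163], that simple 
functions of singularity analysis class are closed under Hadamard product. Establish- 
ing such a closure property requires methods for composing functions from the basic 
scale, namely $(1 - z)^{a}$, as well as error terms of the form $O((1 - z)^{A})$. We address 
each problem in turn. 

The expansion around the origin, 

(54)    $$(1 - z)^{a} = 1 + \frac{(-a)}{1}\, z + \frac{(-a)(-a + 1)}{2!}\, z^2 + \cdots,$$

gives through term-by-term multiplication 

(55)    $$(1 - z)^{a} \odot (1 - z)^{b} = {}_2F_1[-a, -b;\, 1;\, z].$$

Here ${}_2F_1$ represents the classical hypergeometric function of Gauss defined by 

(56)    $${}_2F_1[\alpha, \beta;\, \gamma;\, z] = 1 + \frac{\alpha\beta}{\gamma}\,\frac{z}{1!} + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{\gamma(\gamma+1)}\,\frac{z^2}{2!} + \cdots.$$

From their transformation theory, see for instance [492, Ch XIV], hypergeometric 
functions can generally be expanded in the vicinity of $z = 1$ by means of the $z \mapsto 1 - z$ 
transformation. Instantiation of this transformation with $\gamma = 1$ yields 

(57)    $${}_2F_1[\alpha, \beta;\, 1;\, z] = \frac{\Gamma(1 - \alpha - \beta)}{\Gamma(1 - \alpha)\Gamma(1 - \beta)}\; {}_2F_1[\alpha, \beta;\, \alpha + \beta;\, 1 - z]
+ \frac{\Gamma(\alpha + \beta - 1)}{\Gamma(\alpha)\Gamma(\beta)}\,(1 - z)^{-\alpha - \beta + 1}\; {}_2F_1[1 - \alpha, 1 - \beta;\, 2 - \alpha - \beta;\, 1 - z].$$

From there, we can state: 

Theorem VI.10 (Hadamard Composition). When none of $a$, $b$, $a + b$ is an integer, 
the Hadamard product $(1 - z)^{a} \odot (1 - z)^{b}$ has an infinite Δ-expansion with exponent 
scale $\{0, 1, 2, \ldots\} \cup \{a + b + 1, a + b + 2, \ldots\}$, namely, 

$$(1 - z)^{a} \odot (1 - z)^{b} \sim \sum_{k\ge 0} \lambda_k^{(a,b)}\, \frac{(1 - z)^{k}}{k!} + \sum_{k\ge 0} \mu_k^{(a,b)}\, \frac{(1 - z)^{a+b+1+k}}{k!},$$

where the coefficients $\lambda$ and $\mu$ are given by 

$$\lambda_k^{(a,b)} = \frac{\Gamma(1 + a + b)}{\Gamma(1 + a)\Gamma(1 + b)}\, \frac{(-a)^{\overline{k}}\,(-b)^{\overline{k}}}{(-a - b)^{\overline{k}}}, \qquad
\mu_k^{(a,b)} = \frac{\Gamma(-a - b - 1)}{\Gamma(-a)\Gamma(-b)}\, \frac{(1 + a)^{\overline{k}}\,(1 + b)^{\overline{k}}}{(2 + a + b)^{\overline{k}}}.$$

Here $x^{\overline{k}}$ is defined for $k \in \mathbb{Z}_{\ge 0}$ by $x^{\overline{k}} := x(x + 1)\cdots(x + k - 1)$. 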
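
A numerical check makes the statement concrete. The sketch below is illustrative, with hypothetical function names and arbitrarily chosen values $a = 0.5$, $b = 0.25$, $z = 0.9$: it sums the Taylor series of $(1-z)^a \odot (1-z)^b$ directly, using $[z^n](1-z)^a = (-a)^{\overline{n}}/n!$, and compares the result with a truncation of the two series in powers of $(1 - z)$ built from $\lambda_k^{(a,b)}$ and $\mu_k^{(a,b)}$. For $|1 - z| < 1$ these series are assumed here to converge (they come from the hypergeometric identity (57)), so the two evaluations should agree up to rounding.

```python
from math import gamma

def rising(x, k):
    """Rising factorial x(x+1)...(x+k-1)."""
    r = 1.0
    for i in range(k):
        r *= x + i
    return r

def hadamard_by_series(a, b, z, terms=3000):
    """Sum over n of [z^n](1-z)^a * [z^n](1-z)^b * z^n, truncated after `terms` terms."""
    total, ca, cb, zn = 0.0, 1.0, 1.0, 1.0
    for n in range(terms):
        total += ca * cb * zn
        ca *= (n - a) / (n + 1)       # ratio of consecutive coefficients of (1-z)^a
        cb *= (n - b) / (n + 1)
        zn *= z
    return total

def hadamard_by_expansion(a, b, z, K=30):
    """Truncated expansion at z = 1 with the coefficients lambda_k and mu_k."""
    lam0 = gamma(1 + a + b) / (gamma(1 + a) * gamma(1 + b))
    mu0 = gamma(-a - b - 1) / (gamma(-a) * gamma(-b))
    x, total, kfact = 1 - z, 0.0, 1.0
    for k in range(K):
        lam = lam0 * rising(-a, k) * rising(-b, k) / rising(-a - b, k)
        mu = mu0 * rising(1 + a, k) * rising(1 + b, k) / rising(2 + a + b, k)
        total += (lam * x**k + mu * x ** (a + b + 1 + k)) / kfact
        kfact *= k + 1
    return total

a, b, z = 0.5, 0.25, 0.9
lhs = hadamard_by_series(a, b, z)
rhs = hadamard_by_expansion(a, b, z)
print(lhs, rhs)
```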
> VI.25. Special cases. The case where either a or b is an integer poses no difficulty, since, for 
$m \in \mathbb{Z}_{\ge 0}$, the function $(1 - z)^{m} \odot g(z)$ is a polynomial, while $(1 - z)^{-m} \odot g(z)$ is reducible 
to a derivative of $g$, to which the Singular Differentiation Theorem can be applied. 
The case $a + b \in \mathbb{Z}$ needs transformation formulae that extend (57): the principles (based 
on a Lindelöf integral representation and developed by Barnes) are described in [492, §14.53], 
while the formulae appear explicitly in [2, pp. 559-560]. ◁ 
> VI.26. Simple expansions with logarithmic terms. The technique of differentiation with 
respect to a parameter, 

$$\big[(1 - z)^{a} L(z)^{k}\big] \odot (1 - z)^{b} = (-1)^{k}\, \frac{\partial^{k}}{\partial a^{k}} \big[(1 - z)^{a} \odot (1 - z)^{b}\big],$$

makes it possible to derive explicit composition rules for expansions involving logarithmic 
terms. ◁ 
Next, we address the Hadamard composition of error terms in singular expan- 
sions. The way Hadamard products preserve Δ-analyticity and compose error terms 
is summarized by the following statement. 

Theorem VI.11 (Hadamard closure). (i) Assume that $f(z)$ and $g(z)$ are Δ-analytic 
in $\Delta(\psi_0, \eta)$. Then the Hadamard product $(f \odot g)(z)$ is analytic in a (possibly smaller) 
Δ-domain, call it $\Delta'$. 

(ii) Assume further that 

$$f(z) = O\big((1 - z)^{a}\big) \quad\text{and}\quad g(z) = O\big((1 - z)^{b}\big), \qquad z \in \Delta(\psi_0, \eta).$$

Then the Hadamard product $(f \odot g)(z)$ admits in $\Delta'$ an expansion given by the fol- 
lowing rules: 
— If $a + b + 1 < 0$, then 

$$(f \odot g)(z) = O\big((1 - z)^{a+b+1}\big).$$

— If $k < a + b + 1 < k + 1$, for some integer $k \in \mathbb{Z}_{\ge -1}$, then 

$$(f \odot g)(z) = \sum_{j=0}^{k} \frac{(-1)^{j}}{j!}\, (f \odot g)^{(j)}(1)\, (1 - z)^{j} + O\big((1 - z)^{a+b+1}\big).$$

— If $a + b + 1$ is a nonnegative integer, then (with $L(z) = \log(1 - z)^{-1}$) 

$$(f \odot g)(z) = \sum_{j=0}^{a+b} \frac{(-1)^{j}}{j!}\, (f \odot g)^{(j)}(1)\, (1 - z)^{j} + O\big((1 - z)^{a+b+1} L(z)\big).$$

PROOF. (Sketch) The starting point is an important formula due to Hadamard that 
expresses Hadamard products as a contour integral: 

(58)    $$f(z) \odot g(z) = \frac{1}{2i\pi} \int_{\gamma} f(w)\, g\!\left(\frac{z}{w}\right) \frac{dw}{w}.$$

The contour $\gamma$ in the $w$-plane should be chosen such that both factors, $f(w)$ and 
$g(z/w)$, are analytic. In other words, given the domain $\Delta$ in which both $f$ and $g$ 
are analytic, one should have $\gamma \subset \Delta \cap (z\Delta^{-1})$. 

In the first case ($a + b + 1 < 0$), the precise geometry of a feasible contour $\gamma$ 
is described in [163], the principles being similar to those employed in the construc- 
tion of Hankel contours elsewhere in this chapter. The integral giving the value of the 
Hadamard product is finally estimated trivially, based on the order of growth assump- 
tions on $f$ and $g$, as $z \to 1$. This approach extends to the case $a + b + 1 = 0$, where a 
logarithmic factor comes in. 

For the remaining cases, the easy identity 

$$\vartheta^{a+b}(f \odot g) = (\vartheta^{a} f) \odot (\vartheta^{b} g), \quad\text{where}\quad \vartheta \equiv z\frac{d}{dz},$$

reduces the analysis to the situation where $a + b + 1 < 0$. It suffices to differenti- 
ate sufficiently many times and finally integrate back, as permitted by the Singular 
Integration Theorem. 


Globally, Theorems VI.10 and VI.11 establish the closure under Hadamard prod- 
ucts of functions amenable to singularity analysis in the sense of (48). In practice, 
in order to derive the singular expansion of a function at a singularity, one may con- 
veniently appeal to the Zigzag Algorithm described in Figure 16, whose validity is 
ensured by the a priori knowledge of the existence of an expansion guaranteed by 
Theorems VI.10 and VI.11. A typical application of this algorithm appears in Equa- 
tions (61) and (62) below, in the context of Pólya's drunkard problem. 


Let $f(z)$ and $g(z)$ be Δ-analytic and admit simple singular expansions of the form (48) or (49). 
What is sought is the singular expansion of 

$$h(z) = f(z) \odot g(z).$$

Step 1. Determine the asymptotic expansions $f_n = [z^n]f(z)$ and $g_n = [z^n]g(z)$ induced by 
the singular expansions of $f$ and $g$ in accordance with the singularity analysis process. Given 
finite singular expansions of $f$ and $g$, the order $C$ of the error in the expansion of $h$ is known a 
priori by Theorem VI.11. 

Step 2. Deduce from Step 1 an asymptotic expansion of $h_n = [z^n]h(z)$ by usual multiplication 
from the expansions of $f_n$ and $g_n$. 

Step 3. Reconstruct by singularity analysis a function $H(z)$ that is singular at 1 and is such that 
$[z^n]H(z) \sim [z^n]h(z)$. 
This can be done by using the expansions of basic functions, as provided by Theorems VI.1 
and VI.2 in the reverse direction. By construction, $H(z)$ is a sum of functions of the form 
$(1 - z)^{\alpha} L(z)^{k}$, which are all singular at 1. 

Step 4. Output the singular expansion of $f \odot g$ as 

$$h(z) = H(z) + P(z),$$

where $P$ is a polynomial of degree $\delta$, which is the largest integer $< C$. The polynomial $P(z)$ 
is needed, since polynomials (and more generally functions analytic at 1) do not leave a trace 
in asymptotic expansions of coefficients. Since $h(z) - H(z)$ is $\delta$ times differentiable at 1, one 
must take 

$$P(z) = \sum_{j=0}^{\delta} \frac{(-1)^{j}}{j!}\, \partial_z^{j}\big(h(z) - H(z)\big)_{z=1}\, (1 - z)^{j}.$$

FIGURE VI.16. The Zigzag algorithm for computing singular expansions of Hadamard 
products. 


EXAMPLE VI.14. Pólya's drunkard problem. (This example is taken from [163].) In the 
$d$-dimensional lattice $\mathbb{Z}^d$ of points with integer coordinates, the drunkard performs a random 
walk starting from the origin with steps in $\{-1, +1\}^d$, each taken with equal likelihood. The 
probability that the drunkard is back at the origin after $2n$ steps is 

(59)    $$q_n^{(d)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{d},$$

since the walk is a product of $d$ independent 1-dimensional walks. The probability that $2n$ is the 
epoch of the first return to the origin is the quantity $p_n^{(d)}$, which is determined implicitly by 

(60)    $$\left(1 - \sum_{n\ge 1} p_n^{(d)} z^n\right)^{-1} = \sum_{n\ge 0} q_n^{(d)} z^n,$$

as results from the convolution equations expressing the decomposition of loops into primitive 
loops. In terms of the associated ordinary generating functions $P$ and $Q$, this relation thus reads 
as $(1 - P(z))^{-1} = Q(z)$. 

The asymptotic analysis of the $q_n$'s is straightforward; the one of the $p_n$'s is more in- 
volved and is of interest in connection with recurrence and transience of the random walk; see, 
e.g., [133, 329]. The Hadamard closure theorem provides a direct access to this problem. Define 

$$\lambda(z) := \sum_{n\ge 0} \frac{1}{2^{2n}}\binom{2n}{n} z^n \equiv \frac{1}{\sqrt{1 - z}}.$$

Then, Equations (59) and (60) imply: 

$$P(z) = 1 - \frac{1}{\lambda(z)^{\odot d}}, \quad\text{where}\quad \lambda(z)^{\odot d} := \lambda(z) \odot \cdots \odot \lambda(z) \quad (d \text{ times}).$$

The singularities of $P(z)$ are found to be as follows. 

Case d = 1: No Hadamard product is involved and 

$$P(z) = 1 - \sqrt{1 - z}, \quad\text{implying}\quad p_n^{(1)} = \frac{1}{n\,2^{2n-1}}\binom{2n-2}{n-1} \sim \frac{1}{2\sqrt{\pi n^3}}.$$

(This agrees with the classical combinatorial solution expressed in terms of Catalan numbers.) 


Case d = 2: By the Hadamard closure theorem, the function $Q(z) = \lambda(z) \odot \lambda(z)$ admits 
a priori a singular expansion at $z = 1$ that is composed solely of elements of the form $(1 - z)^{\alpha}$ 
possibly multiplied by integral powers of the logarithmic function $L(z)$. From a computational 
standpoint (cf. the Zigzag Algorithm), it is then best to start from the coefficients themselves, 

(61)    $$q_n^{(2)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{2} \sim \frac{1}{\pi}\left(\frac{1}{n} - \frac{1}{4n^2} + \frac{1}{32n^3} + \cdots\right),$$

and reconstruct the only singular expansion that is compatible, namely 

(62)    $$Q(z) = \frac{1}{\pi} L(z) + K + O\big((1 - z)^{1-\epsilon}\big),$$

where $\epsilon > 0$ is an arbitrarily small constant and $K$ is fully determined as the limit as $z \to 1$ 
of $Q(z) - \pi^{-1} L(z)$. Then it can be seen that the function $P$ is Δ-continuable. (Proof: 
Otherwise, there would be complex poles arising from zeros of the function $Q$ on the unit disc, 
and this would entail in $p_n^{(2)}$ the presence of terms oscillating around 0, a fact that contradicts 
the necessary positivity of probabilities.) The singular expansion of $P(z)$ at $z = 1$ results 
immediately from that of $Q(z)$: 

$$P(z) \sim 1 - \frac{\pi}{L(z)} + \frac{\pi^2 K}{L^2(z)} + \cdots,$$

so that, by Theorems VI.2 and VI.3, one has 

$$p_n^{(2)} = \frac{\pi}{n \log^2 n} - 2\pi\,\frac{\gamma + \pi K}{n \log^3 n} + O\!\left(\frac{1}{n \log^4 n}\right),$$

$$K = 1 + \sum_{n=1}^{\infty} \left(16^{-n}\binom{2n}{n}^{2} - \frac{1}{\pi n}\right) \doteq 0.8825424006106063735858257.$$

(See the study by Louchard et al. [345, Sec. 4] for somewhat similar calculations.) 


Case d = 3: This case is easy since $Q(z)$ remains finite at its singularity $z = 1$ where it 
admits an expansion in powers of $(1 - z)^{1/2}$, to the effect that 

$$q_n^{(3)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{3} \sim \frac{1}{\pi^{3/2}}\left(\frac{1}{n^{3/2}} - \frac{3}{8n^{5/2}} + \cdots\right).$$

The function $Q(z)$ is a priori Δ-continuable and its singular expansion can be reconstructed 
from the form of coefficients: 

$$Q(z) \underset{z\to 1}{\sim} Q(1) - \frac{2}{\pi}\sqrt{1 - z} + O(|1 - z|),$$

leading to 

$$P(z) = \left(1 - \frac{1}{Q(1)}\right) - \frac{2}{\pi Q^2(1)}\sqrt{1 - z} + O(|1 - z|).$$

By singularity analysis, the last expansion gives 

$$p_n^{(3)} = \frac{1}{\pi^{3/2} Q^2(1)}\, \frac{1}{n^{3/2}} + O\!\left(\frac{1}{n^2}\right), \qquad
Q(1) = \frac{\pi}{\Gamma(\frac34)^4} \doteq 1.3932039296856768591842463.$$

A complete asymptotic expansion in powers $n^{-3/2}, n^{-5/2}, \ldots$ can be obtained by the same de- 
vices. In particular this improves the error term above to $O(n^{-5/2})$. The explicit form of $Q(1)$ 
results from its expression as the generalized hypergeometric ${}_3F_2[\frac12, \frac12, \frac12;\, 1, 1;\, 1]$, which eval- 
uates by Clausen's theorem and Kummer's identity to the square of a complete elliptic integral. 
(See the papers by Larry Glasser for context, for instance [236]; nowadays, several computer 
algebra systems even provide this value automatically.) 
Higher dimensions are treated similarly, with logarithmic terms surfacing in asymptotic 
expansions for all even dimensions. END OF EXAMPLE VI.14. 
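
The two numerical constants of this example can be recovered directly from the coefficients. The sketch below is an illustration with hypothetical names; it sums the defining series for $K$ (case $d = 2$) and for $Q(1)$ (case $d = 3$) up to a cutoff `N` and adds a leading-order tail correction deduced from $4^{-n}\binom{2n}{n} \sim (\pi n)^{-1/2}$. The closed form $K = 4\log 2/\pi$ used for comparison is not stated in the text; it is assumed here from the classical logarithmic behaviour of the complete elliptic integral.

```python
from math import gamma, log, pi, sqrt

def drunkard_constants(N=200_000):
    """Approximate K (d = 2) and Q(1) (d = 3) from the first N terms plus tail estimates."""
    r = 1.0                      # r_n = 4^(-n) * binom(2n, n), updated by r_n = r_{n-1} (2n-1)/(2n)
    K = 1.0                      # n = 0 term of Q(z) for d = 2
    Q1 = 1.0                     # n = 0 term of Q(1) for d = 3
    for n in range(1, N + 1):
        r *= (2 * n - 1) / (2 * n)
        K += r * r - 1 / (pi * n)
        Q1 += r**3
    K -= 1 / (4 * pi * N)                 # tail of sum -(1/(4 pi n^2)) over n > N
    Q1 += 2 / (pi**1.5 * sqrt(N))         # tail of sum (pi n)^(-3/2) over n > N
    return K, Q1

K, Q1 = drunkard_constants()
print(K, 4 * log(2) / pi)
print(Q1, pi / gamma(0.75) ** 4)
print("return probability for d = 3:", 1 - 1 / Q1)
```

The last line uses $P(1) = 1 - 1/Q(1)$, the probability that the walk with steps in $\{-1,+1\}^3$ ever returns to the origin.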


VI. 10.3. Tree recurrences. To conclude with singularity analysis theory, we 
present the general framework of tree recurrences, also known as probabilistic divide- 
and-conquer recurrences, which are recurrences of the general form 


(63) fr=tnt > paalfet+ fron), (n> No). 
k 


There, (f,,) is the sequence implicitly determined by the recurrence, assuming known 
initial conditions fo,..., fng—1; the sequence (t,,) is known as the sequence of tolls; 
the array (pp,x) is a triangular array of numbers that are probabilities in the sense that, 
for each fixed n > 0, one has Vx Pn,k = 1; the number a is a small fixed integer 
(usually 0 or 1). 

The interpretation of the recurrence is in the form of a splitting process : a collec- 
tion of n elements is given; a number a of these is put aside and the rest is partitioned 
into two subgroups, a “left” subgroup of cardinality K,, and a “right” subgroup of 
cardinality n — a — K,,. The quantity K,, is a random variable with probability distri- 
bution 

P(K, => k) = Pn,k+ 
The splitting is repeated recursively till only groups of size less than the threshold 
mo are obtained. Assuming stochastic independence of all the random variables Ir 
involved, it is seen that /,, represents the expectation of the (total) cost C,, of arandom 
(recursive) splitting, when a single stage involving n elements incurs a toll equal to ty. 
In symbols: 


fn = E(C,), Cy = tn + Ck, + OR ae on 

Clearly, a particular realization of the splitting process can be represented by a 
binary tree. With a suitable choice of probabilities, such processes can be used to anal- 
yse cost functional of increasing binary trees, and binary Catalan trees, for instance. 
A prime motivation is the analysis of divide-and-conquer algorithms in computer sci- 
ence, like quicksort, mergesort, union-find algorithms, and so on [101, 208, 433]. Our 
treatment once more follows the article [163]. 
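As a concrete companion to the general form $f_n = t_n + \sum_k p_{n,k}(f_k + f_{n-a-k})$, here is a minimal Python sketch that unwinds such a recurrence with exact rational arithmetic. The function name `tree_recurrence` and its calling convention are ours and purely illustrative; the tolls, the probability array and the initial values are passed in by the caller.

```python
from fractions import Fraction

def tree_recurrence(toll, prob, a, n_max, initial):
    """Return [f_0, ..., f_n_max] for f_n = t_n + sum_k p_{n,k} (f_k + f_{n-a-k}).

    toll(n) gives t_n, prob(n, k) gives p_{n,k} for 0 <= k <= n - a, and
    initial lists the known values f_0, ..., f_{n0-1}.  Terms with p_{n,k} = 0
    are skipped, so a = 0 works as long as p_{n,0} = p_{n,n} = 0.
    """
    f = [Fraction(x) for x in initial]
    for n in range(len(f), n_max + 1):
        total = Fraction(toll(n))
        for k in range(0, n - a + 1):
            p = prob(n, k)
            if p != 0:
                total += p * (f[k] + f[n - a - k])
        f.append(total)
    return f

# Uniform splitting with one element put aside (a = 1) and unit tolls:
# every split costs 1, so the expected cost for size n is n itself.
costs = tree_recurrence(lambda n: 1, lambda n, k: Fraction(1, n), 1, 10, [0])
print(costs[10])
```

With unit tolls the cost simply counts the splits performed, which gives a quick sanity check for any choice of splitting probabilities that sum to 1.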

A general approach to the asymptotic solution of a tree recurrence goes as fol- 
lows. First, introduce generating functions, 


    $f(z) = \sum_n f_n \omega_n z^n, \qquad t(z) = \sum_n t_n \omega'_n z^n,$

for some normalization sequences $(\omega_n)$ and $(\omega'_n)$ that are problem-specific. (So, $\omega_n \equiv 1$
gives rise to an OGF, $\omega_n = 1/n!$ to an EGF, with other normalizations being also
useful.) Then, by linearity of the original recurrence, there exists a linear operator $\mathcal{L}$
on series (and functions), such that

    $f(z) = \mathcal{L}[t(z)].$

Provided the splitting probabilities $p_{n,k}$ have expressions of a tractable form, it is
reasonable to attempt expressing $\mathcal{L}$ in terms of the usual operations of analysis. One may
then investigate the way $\mathcal{L}$ affects singularities and deduce the asymptotic form of the
cost sequence $(f_n)$ from the singularities of its generating function, $f(z)$. An interesting
feature of this approach is to allow for a powerful discussion of the relationship
between tolls and induced costs, in a way that parallels composition of singularities 
in Section VI.9. Closure properties discussed earlier in this section are naturally a 
crucial ingredient in the intervening singularity analysis process. 

The three examples that we present combine closure properties with the singu- 
larity analysis of polylogarithms of Section VI.8. Example 15 is relative to increas- 
ing binary trees and binary search trees (Example II.17, p. 132). Example 16 dis- 
cusses additive costs of random binary Catalan trees in the perspective of tree recur- 
rences. Finally, Example 17 shows the applicability of singularity analysis to a basic 
coalescence-fragmentation process. 


EXAMPLE VI.15. The binary search tree recurrence. One of the simplest random tree models
is defined as follows: a random binary tree of size $n \ge 1$ is obtained by taking a root and
appending to it a left subtree of size $K_n$ and a right subtree of size $n - 1 - K_n$, where $K_n$ is
uniformly distributed over the set of permissible values $\{0, 1, \ldots, n-1\}$. A tree of size 0 is
the empty tree. In earlier notations, this process corresponds to

    $p_{n,k} = \mathbb{P}(K_n = k) = \frac{1}{n}, \qquad 0 \le k \le n-1.$

The associated tree recurrence is then

    $f_n = t_n + \frac{2}{n} \sum_{k=0}^{n-1} f_k, \qquad f_0 = t_0,$

which translates for OGFs,

    $f(z) := \sum_{n \ge 0} f_n z^n, \qquad t(z) := \sum_{n \ge 0} t_n z^n,$

into a linear integral equation:

(64)    $f(z) = t(z) + 2 \int_0^z f(w) \, \frac{dw}{1-w}.$

Differentiation yields the ordinary differential equation

    $f'(z) = t'(z) + \frac{2}{1-z} f(z), \qquad f(0) = t_0,$


which is then solved by the variation-of-constants method. In this way, it is found that an
integral transform expresses the relation between the GF of tolls and the GF of total costs.
Assuming without loss of generality $t_0 = 0$, we have (with $\partial_w \equiv \frac{d}{dw}$)

(65)    $f(z) = \mathcal{L}[t(z)], \quad \text{where} \quad \mathcal{L}[t(z)] = \frac{1}{(1-z)^2} \int_0^z (\partial_w t(w)) \, (1-w)^2 \, dw.$


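Simple toll sequences that admit generating functions of a simple form can then be
employed to build a repertoire^{10} that already provides useful indications on the relations between
the orders of growth of $(t_n)$ and $(f_n)$. For instance, we find, for the rising-factorial tolls

    $t_n^\alpha = \binom{n+\alpha}{\alpha} \quad (n \ge 1), \qquad t^\alpha(z) = (1-z)^{-\alpha-1} - 1,$

the costs

    $f^\alpha(z) = \frac{\alpha+1}{\alpha-1} \left[ (1-z)^{-\alpha-1} - (1-z)^{-2} \right], \qquad f_n^\alpha = \frac{\alpha+1}{\alpha-1} \left[ \binom{n+\alpha}{\alpha} - (n+1) \right],$

for $\alpha \ne 1$, while $\alpha = 1$, corresponding to $t_n^1 = n + 1$, leads to

    $f^1(z) = \frac{2}{(1-z)^2} \log \frac{1}{1-z}, \qquad f_n^1 = 2(n+1)(H_{n+1} - 1) = 2n \log n + O(n),$

with $H_n$ a harmonic number. The emergence of an extra logarithmic factor for $\alpha = 1$ is to
be noted: it corresponds to the fact that path length in a binary search tree or an increasing
binary tree of size $n$ is $\sim 2n \log n$. These elementary techniques provide a first set of entries
recapitulated in Figure 17.

^{10} The repertoire approach is developed in an attractive manner by Greene and Knuth in [250].

    Tolls ($t_n$)                                   Costs ($f_n$)
    $n^\alpha$  ($2 < \alpha$)                      $\frac{\alpha+1}{\alpha-1} n^\alpha + O(n^{\alpha-1})$
    $n^\alpha$  ($1 < \alpha < 2$)                  $\frac{\alpha+1}{\alpha-1} n^\alpha + O(n)$
    $\binom{n+\alpha}{\alpha}$  ($\alpha > 1$)      $\frac{\alpha+1}{\alpha-1} \left[ \binom{n+\alpha}{\alpha} - (n+1) \right] \sim \frac{\alpha+1}{\alpha-1} \frac{n^\alpha}{\Gamma(\alpha+1)}$
    $\binom{n+\alpha}{\alpha}$  ($0 < \alpha < 1$)  $\frac{1+\alpha}{1-\alpha} \left[ (n+1) - \binom{n+\alpha}{\alpha} \right] \sim \frac{1+\alpha}{1-\alpha} n$
    $n^\alpha$  ($0 < \alpha < 1$)                  $K_\alpha n + O(n^\alpha)$
    $\log n$                                        $K'_0 n - \log n + O(1)$

FIGURE VI.17. Tolls and costs for the binary search tree recurrence, assuming $t_0 = 0$.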
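The repertoire entries can be checked mechanically. The sketch below (helper names are ours) unwinds $f_n = t_n + \frac{2}{n}\sum_{k<n} f_k$ with $f_0 = t_0 = 0$ in exact arithmetic and compares the result with the closed forms $\frac{\alpha+1}{\alpha-1}\left[\binom{n+\alpha}{\alpha} - (n+1)\right]$ for integer $\alpha \ne 1$ and $2(n+1)(H_{n+1}-1)$ for the toll $n+1$. Restricting $\alpha$ to integers is an assumption made only so that the binomial coefficients stay exact.

```python
from fractions import Fraction
from math import comb

def bst_costs(toll, n_max):
    """f_n = t_n + (2/n) * (f_0 + ... + f_{n-1}), with f_0 = t_0 = 0."""
    f = [Fraction(0)]
    running_sum = Fraction(0)                  # f_0 + ... + f_{n-1}
    for n in range(1, n_max + 1):
        running_sum += f[n - 1]
        f.append(Fraction(toll(n)) + Fraction(2, n) * running_sum)
    return f

def closed_form(alpha, n):
    """(alpha+1)/(alpha-1) * [binom(n+alpha, alpha) - (n+1)], integer alpha != 1."""
    return Fraction(alpha + 1, alpha - 1) * (comb(n + alpha, alpha) - (n + 1))

def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))

N = 30
for alpha in (2, 3, 5):
    f = bst_costs(lambda n: comb(n + alpha, alpha), N)
    assert all(f[n] == closed_form(alpha, n) for n in range(N + 1))

# alpha = 1: toll n + 1 gives the path-length type cost 2(n+1)(H_{n+1} - 1).
f1 = bst_costs(lambda n: n + 1, N)
assert all(f1[n] == 2 * (n + 1) * (harmonic(n + 1) - 1) for n in range(1, N + 1))
```

The exact agreement for several values of $\alpha$ also confirms the sign and the factor $(\alpha+1)/(\alpha-1)$ used in Figure 17.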

Singularity analysis furthermore permits us to develop a complete asymptotic expansion
for tolls of the form $\sqrt{n}$, $\log n$, and many others. Consider for instance the toll $t_n^\alpha = n^\alpha$,
for which the generating function, a polylogarithm, is known to admit a singular expansion in
terms of elements of the form $(1-z)^\beta$, with the main term corresponding to $\beta = -\alpha - 1$
when $\alpha > -1$ (Theorem VI.7). The $\mathcal{L}$ transformation reads as a succession of operations,
"differentiate, multiply by $(1-z)^2$, integrate, multiply by $(1-z)^{-2}$", which are covered by
Theorems VI.8 and VI.9. The chain on any particular element starts as

    $c(1-z)^\beta \;\xrightarrow{\ \partial\ }\; -c\beta(1-z)^{\beta-1} \;\xrightarrow{\ \times(1-z)^2\ }\; -c\beta(1-z)^{\beta+1},$

at which stage integration intervenes. According to Theorem VI.9, assuming $\beta \ne -2$ and
ignoring integration constants, integration gives

    $-c\beta(1-z)^{\beta+1} \;\xrightarrow{\ \int\ }\; c\,\frac{\beta}{\beta+2}(1-z)^{\beta+2} \;\xrightarrow{\ \times(1-z)^{-2}\ }\; c\,\frac{\beta}{\beta+2}(1-z)^{\beta}.$

Thus, the singular element $c(1-z)^\beta$ corresponds to a contribution

    $c\,\frac{\beta}{\beta+2} \, \frac{n^{-\beta-1}}{\Gamma(-\beta)},$

which is of order $O(n^{-\beta-1})$. (For the main term, $\beta = -\alpha - 1$ and the factor $\beta/(\beta+2)$
equals $(\alpha+1)/(\alpha-1)$, in agreement with Figure 17.) It can be verified that this chain of
operations suffices to determine the leading order of $f_n$ when $t_n = n^\alpha$ and $\alpha > 1$.

The derivation above is representative of the main lines of the analysis, but it has left
aside the determination of integration constants, which play a dominant rôle when $t_n = n^\alpha$
and $\alpha < 1$ (because a term of the form $K/(1-z)^2$ then dominates in $f(z)$). Introduce, in
accordance with the statement of the Singular Integration Theorem (Theorem VI.9) the quantity

    $K[t] := \int_0^1 \left[ t'(w)(1-w)^2 - \left( t'(w)(1-w)^2 \right)_- \right] dw,$

where $f_-$ represents the sum of singular terms of exponent $< -1$ in the singular expansion of
$f(z)$. Then, for $t_n = n^\alpha$ with $0 < \alpha < 1$, taking into account the integration constant (which
gets multiplied by $(1-z)^{-2}$, given the shape of $\mathcal{L}$), we find:

    $f_n \sim K_\alpha n, \qquad K_\alpha = K[\mathrm{Li}_{-\alpha}] = 2 \sum_{n=1}^{\infty} \frac{n^\alpha}{(n+1)(n+2)}.$

Similarly, the toll $t_n = \log n$ gives rise to

    $f_n \sim K'_0 n, \qquad K'_0 = 2 \sum_{n=1}^{\infty} \frac{\log n}{(n+1)(n+2)} \doteq 1.2035649167.$

This last estimate quantifies the entropy of the distribution of binary search trees, which is
studied by Fill in [164], and discussed in the reference book by Cover and Thomas on information
theory [103, p. 74-76].  . . . . . END OF EXAMPLE VI.15.
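The numerical value of the constant attached to the toll $\log n$ can be reproduced directly from the series $2\sum_{n\ge1} \log n/((n+1)(n+2))$. The series converges slowly, so the sketch below adds a tail estimate $2(\log N + 1)/N$, obtained by replacing the remaining sum by the integral of $2\log x/x^2$; that tail correction is our own heuristic device and not part of the text.

```python
from math import log

def k0_prime(n_terms=200_000):
    """Approximate 2 * sum_{n>=1} log(n) / ((n+1)(n+2))."""
    partial = sum(2.0 * log(n) / ((n + 1) * (n + 2))
                  for n in range(2, n_terms + 1))      # the n = 1 term is zero
    # Heuristic tail: integral of 2 log(x)/x^2 from N to infinity.
    tail = 2.0 * (log(n_terms) + 1.0) / n_terms
    return partial + tail

print(round(k0_prime(), 6))
```

Without the tail term, two hundred thousand summands would still leave an error of order $10^{-4}$, which illustrates why such constants are better computed after some convergence acceleration.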


EXAMPLE VI.16. The binary tree recurrence. Consider a procedure that on a binary tree
of size $n$ performs some calculation (without affecting the tree itself) at a cost of $t_n$, then recursively
calls itself on the left and right subtrees. If the binary tree to which the procedure is applied
is drawn uniformly amongst all binary trees of size $n$, the expectation of the total costs of the
procedure satisfies the recurrence

(66)    $f_n = t_n + \sum_{k=0}^{n-1} \frac{C_k C_{n-1-k}}{C_n} (f_k + f_{n-1-k}) \quad \text{with} \quad C_n = \frac{1}{n+1} \binom{2n}{n}.$

Indeed, the quantity

    $p_{n,k} = \frac{C_k C_{n-1-k}}{C_n}$

represents the probability that a random tree of size $n$ has a left subtree of size $k$ and a right
subtree of size $n - 1 - k$. It is then natural to introduce the generating functions

    $t(z) = \sum_{n \ge 0} t_n C_n z^n, \qquad f(z) = \sum_{n \ge 0} f_n C_n z^n,$

and the recurrence (66) translates into a linear equation:

    $f(z) = t(z) + 2zC(z) f(z),$

with $C(z)$ the OGF of Catalan numbers. Now, given a toll sequence $(t_n)$ with ordinary
generating function

    $\tau(z) := \sum_{n \ge 0} t_n z^n,$

the function $t(z)$ is a Hadamard product: $t(z) = \tau(z) \odot C(z)$. Also, $C(z)$ is well known, so
that the fundamental relation is

(67)    $f(z) = \mathcal{L}[\tau(z)], \quad \text{where} \quad \mathcal{L}[\tau(z)] = \frac{\tau(z) \odot C(z)}{\sqrt{1-4z}}, \qquad C(z) = \frac{1 - \sqrt{1-4z}}{2z}.$

This transform relates the ordinary generating function of tolls to the normalized generating
function of the total costs via a Hadamard product.
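Since $1 - 2zC(z) = \sqrt{1-4z}$ and $[z^n](1-4z)^{-1/2} = \binom{2n}{n}$, the relation $f(z) = t(z)/\sqrt{1-4z}$ amounts to the explicit convolution $f_n C_n = \sum_{k=0}^{n} t_k C_k \binom{2(n-k)}{n-k}$. The following sketch (function names are ours) checks this identity against a direct unwinding of the recurrence with Catalan splitting probabilities, taking $f_0 = t_0$.

```python
from fractions import Fraction
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def costs_by_recurrence(toll, n_max):
    """Unwind f_n = t_n + sum_k C_k C_{n-1-k}/C_n (f_k + f_{n-1-k}), f_0 = t_0."""
    f = [Fraction(toll(0))]
    for n in range(1, n_max + 1):
        s = sum(Fraction(catalan(k) * catalan(n - 1 - k), catalan(n))
                * (f[k] + f[n - 1 - k]) for k in range(n))
        f.append(Fraction(toll(n)) + s)
    return f

def costs_by_transform(toll, n_max):
    """Read f_n C_n off as the coefficient of z^n in t(z) / sqrt(1 - 4z)."""
    return [Fraction(sum(toll(k) * catalan(k) * comb(2 * (n - k), n - k)
                         for k in range(n + 1)), catalan(n))
            for n in range(n_max + 1)]

toll = lambda n: n * n
assert costs_by_recurrence(toll, 25) == costs_by_transform(toll, 25)
```

For the constant toll $t_n = 1$ both routes give $2n + 1$, the total number of internal and external nodes of a binary tree with $n$ internal nodes.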
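    Tolls ($t_n$)                                             Costs ($f_n$)
    $n^\alpha$  ($\frac{3}{2} < \alpha$)                      $\frac{\Gamma(\alpha - \frac{1}{2})}{\Gamma(\alpha)} n^{\alpha + 1/2} + O(n^{\alpha - 1/2})$
    $n^{3/2}$                                                 $\frac{2}{\sqrt{\pi}} n^2 + O(n \log n)$
    $n^\alpha$  ($\frac{1}{2} < \alpha < \frac{3}{2}$)        $\frac{\Gamma(\alpha - \frac{1}{2})}{\Gamma(\alpha)} n^{\alpha + 1/2} + O(n)$
    $n^{1/2}$                                                 $\frac{1}{\sqrt{\pi}} n \log n + O(n)$
    $n^\alpha$  ($0 < \alpha < \frac{1}{2}$)                  $K_\alpha n + O(1)$
    $\log n$                                                  $K'_0 n + O(\sqrt{n})$


FIGURE VI.18. Tolls and costs for the binary tree recurrence.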


The calculation for simple tolls like $n^r$ with $r \in \mathbb{Z}_{\ge 0}$ can be carried out elementarily. For
the tolls $t_n^\alpha = n^\alpha$ what is required is the singular expansion of

    $\tau(z) \odot C\!\left(\frac{z}{4}\right) = \mathrm{Li}_{-\alpha}(z) \odot C\!\left(\frac{z}{4}\right) = \sum_{n=1}^{\infty} \frac{n^\alpha}{n+1} \binom{2n}{n} \left(\frac{z}{4}\right)^n.$

This is precisely covered by Theorems VI.10 and VI.11. The results of Figure 18 follow, after
routine calculations.  . . . . . END OF EXAMPLE VI.16.


EXAMPLE VI.17. The Cayley tree recurrence. Consider $n$ vertices labelled $1, \ldots, n$. There
are $(n-1)! \, n^{n-2}$ sequences of edges,

    $\langle u_1, v_1 \rangle, \langle u_2, v_2 \rangle, \ldots, \langle u_{n-1}, v_{n-1} \rangle,$

that give rise to a tree over $\{1, \ldots, n\}$, since there are $n^{n-2}$ unrooted trees of size $n$
and the $n - 1$ edges of each tree can be listed in $(n-1)!$ orders. At each stage $k$, the edges
numbered 1 to $k$ determine a
forest. Each addition of an edge connects two trees and reduces the number of trees in the forest
by 1, so that the forest evolves from the totally disconnected graph (at time 0) to an unrooted
tree (at time $n - 1$). If we consider each of the sequences to be equally likely, the probability
that $u_{n-1}$ and $v_{n-1}$ belong to components of size $k$ and $(n - k)$ is

    $\frac{1}{2(n-1)} \binom{n}{k} \frac{k^{k-1} (n-k)^{n-k-1}}{n^{n-2}}.$

(The reason is that there are $n^{n-2}$ unrooted trees; the last added edge has $n - 1$ possibilities
and 2 possible orientations.)

Assume that the aggregation of two trees into a tree of size equal to $\ell$ incurs a toll of $t_\ell$.
The expected total cost of the aggregation process for a final tree of size $n$ satisfies the recurrence

(68)    $f_n = t_n + \sum_{0<k<n} p_{n,k} (f_k + f_{n-k}), \qquad p_{n,k} = \frac{1}{2(n-1)} \binom{n}{k} \frac{k^{k-1} (n-k)^{n-k-1}}{n^{n-2}}.$

The recurrence (68) has been studied in detail by Knuth and Pittel [311], building upon earlier
works of Knuth and Schönhage [312]. A prime motivation of the cited works is the importance
of this recurrence in algorithms that dynamically manage equivalence relations (the so-called
union-find algorithm [312]).
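That the numbers $p_{n,k} = \frac{1}{2(n-1)}\binom{n}{k}\frac{k^{k-1}(n-k)^{n-k-1}}{n^{n-2}}$ do form a probability distribution over $0 < k < n$ is easily confirmed in exact arithmetic. The sketch below (names are ours) does so, and also unwinds the recurrence for the illustrative toll $t_n = 1$ with $f_1 = 0$, in which case the cost must equal the number $n - 1$ of merges.

```python
from fractions import Fraction
from math import comb

def cayley_split_prob(n, k):
    """p_{n,k}: the last edge joins components of sizes k and n - k."""
    return Fraction(comb(n, k) * k ** (k - 1) * (n - k) ** (n - k - 1),
                    2 * (n - 1) * n ** (n - 2))

def cayley_costs(toll, n_max):
    """f_n = t_n + sum_{0<k<n} p_{n,k} (f_k + f_{n-k}), with f_1 = 0."""
    f = [None, Fraction(0)]                      # index 0 unused
    for n in range(2, n_max + 1):
        s = sum(cayley_split_prob(n, k) * (f[k] + f[n - k]) for k in range(1, n))
        f.append(Fraction(toll(n)) + s)
    return f

# The splitting probabilities sum to 1 for every n >= 2.
assert all(sum(cayley_split_prob(n, k) for k in range(1, n)) == 1
           for n in range(2, 40))

# Unit tolls count the merges: a tree of size n is built by n - 1 of them.
unit = cayley_costs(lambda n: 1, 20)
assert all(unit[n] == n - 1 for n in range(1, 21))
```

The first assertion is equivalent to the classical identity $\sum_{0<k<n}\binom{n}{k}k^{k-1}(n-k)^{n-k-1} = 2(n-1)n^{n-2}$.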


FIGURE VI.19. Tolls and costs for the Cayley tree recurrence. [Table entries not recoverable.]


Given the sequence of tolls $(t_n)$, we introduce the generating function

    $\tau(z) = \sum_{n \ge 1} t_n z^n,$

and let $T$ be the Cayley tree function ($T = z e^T$). For total costs, the generating function
adopted is

    $f(z) = \sum_{n \ge 1} f_n n^{n-1} \frac{z^n}{n!}.$

The basic recurrence (68) can then be rephrased as an integral transform involving a Hadamard
product, namely,

(69)    $f(z) = \mathcal{L}[\tau(z)], \quad \text{with} \quad \mathcal{L}[\tau](z) = \frac{1}{2} \frac{T(z)}{1 - T(z)} \int_0^z \partial_w \left( \tau(w) \odot T(w)^2 \right) \frac{dw}{T(w)}.$

Though the expression of the transform looks formidable at first sight, it is really nothing but a
short sequence of basic operations, "Hadamard product, multiplication, differentiation, division,
integration, multiplication", each of which has a quantifiable effect on functions of singularity
analysis class. (The singularity structure of $T(z)$ is itself determined by the Singular Inversion
Theorem, Theorem VI.6.)

The net result is that the effect of tolls of the form $n^\alpha$, $\log n$, and so on, can be analysed:
see Figure 19 for a listing of estimates. Details of the proof are left as an exercise to our
reader and are otherwise found in [163, §5.3]. The analogy of behaviour with the Catalan tree
recurrence stands out.


This example is also of interest since it furnishes an analytically tractable model of a 
coalescence-fragmentation process. This is a topic of great interest in several areas of science, 
for which we refer to Aldous' survey [6].  . . . . . END OF EXAMPLE VI.17.


VI.11. Tauberian theory and Darboux’s method 


There are several alternative approaches to the analysis of coefficients of functions
that are of moderate growth. Naturally, all methods provide estimates compatible
with singularity analysis methods (Theorems VI.1, VI.2, and VI.3). Each one
requires some sort of "regularity condition" either on the part of the function or on the
part of the coefficient sequence, the regularity condition of singularity analysis being
in essence analytic continuation.

The methods briefly surveyed here fall into three broad categories: (i) Elementary
real analytic methods; (ii) Tauberian theorems; (iii) Darboux's method.

Elementary real analytic methods assume some a priori smoothness conditions on 
the coefficient sequence; they are included here for the sake of completeness, though 
properly speaking they do not belong to the galaxy of complex asymptotic methods. 
Their scope is mostly limited to the analysis of products while the other methods
permit one to approach more general functional composition patterns. Tauberian theorems
belong to the category of advanced real analysis methods; they also need some a
priori regularity on the coefficients, typically positivity or monotonicity. Darboux's
method requires some smoothness of the function on the closed unit disk, and, by its 
techniques and scope, it is the closest to singularity analysis. 

We content ourselves with a brief discussion of the main results. For more infor- 
mation, the reader is referred to Odlyzko’s excellent survey [377]. 


Elementary real analytic methods. An asymptotic equivalent of the coefficients 
of a function can sometimes be worked out elementarily from simple properties of the 
component functions. The regularity conditions are a smooth asymptotic behaviour of 
the coefficients of one of the two factors in a product of generating functions. A good 
source for these techniques is Bender’s survey [29]. 

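Theorem VI.12 (Bender's method). Let $a(z) = \sum a_n z^n$ and $b(z) = \sum b_n z^n$ be two
power series with radii of convergence $\alpha > \beta \ge 0$ respectively. Assume that $b(z)$
satisfies the ratio test,

    $\frac{b_{n-1}}{b_n} \to \beta \quad \text{as } n \to \infty.$

Then the coefficients of the product $f(z) = a(z) \cdot b(z)$ satisfy, provided $a(\beta) \ne 0$,

    $[z^n] f(z) \sim a(\beta) \, b_n \quad \text{as } n \to \infty.$

PROOF. (Sketch) The basis of the proof is the following chain:

    $f_n = a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0$
    $\quad = b_n \left( a_0 + a_1 \frac{b_{n-1}}{b_n} + a_2 \frac{b_{n-2}}{b_n} + \cdots + a_n \frac{b_0}{b_n} \right)$
    $\quad = b_n \left( a_0 + a_1 \frac{b_{n-1}}{b_n} + a_2 \frac{b_{n-2}}{b_{n-1}} \frac{b_{n-1}}{b_n} + \cdots \right)$
    $\quad \sim b_n \left( a_0 + a_1 \beta + a_2 \beta^2 + \cdots \right).$

There, only the last line requires a little elementary analysis that is left as an exercise
to the reader.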
This theorem applies for instance to the EGF of 2-regular graphs:

    $f(z) = a(z) \cdot b(z) \quad \text{with} \quad a(z) = e^{-z/2 - z^2/4}, \quad b(z) = \frac{1}{\sqrt{1-z}},$

for which it gives $f_n \sim e^{-3/4} \binom{n - 1/2}{n} \sim \frac{e^{-3/4}}{\sqrt{\pi n}}$, in accordance with Example 2
(p. 378). Clearly, a whole collection of lemmas can be stated in the same vein. Singularity
analysis usually provides more complete expansions, though Theorem VI.12
does apply to a few situations not covered by it.
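The estimate $[z^n] a(z) b(z) \sim a(\beta) b_n$ can be observed numerically on the 2-regular graph example, where $a(z) = e^{-z/2-z^2/4}$, $b(z) = (1-z)^{-1/2}$ and $\beta = 1$. The sketch below (names are ours, floating-point arithmetic) builds both coefficient sequences by simple recurrences, forms the convolution, and compares it with $e^{-3/4} b_n$ and with $e^{-3/4}/\sqrt{\pi n}$.

```python
import math

def exp_poly_coefficients(n_max):
    """a_n for a(z) = exp(-z/2 - z^2/4), using a'(z) = -(1 + z)/2 * a(z)."""
    a = [1.0] + [0.0] * n_max
    for n in range(1, n_max + 1):
        prev2 = a[n - 2] if n >= 2 else 0.0
        a[n] = (-0.5 * a[n - 1] - 0.5 * prev2) / n     # n a_n = -(a_{n-1} + a_{n-2})/2
    return a

def inverse_sqrt_coefficients(n_max):
    """b_n for b(z) = (1 - z)^(-1/2): b_n = b_{n-1} (2n - 1) / (2n)."""
    b = [1.0]
    for n in range(1, n_max + 1):
        b.append(b[-1] * (2 * n - 1) / (2 * n))
    return b

def two_regular_estimates(n):
    a = exp_poly_coefficients(n)
    b = inverse_sqrt_coefficients(n)
    f_n = sum(a[k] * b[n - k] for k in range(n + 1))   # [z^n] a(z) b(z)
    bender = math.exp(-0.75) * b[n]                    # a(1) * b_n
    limit = math.exp(-0.75) / math.sqrt(math.pi * n)
    return f_n, bender, limit

f_n, bender, limit = two_regular_estimates(400)
print(f_n / bender, f_n / limit)
```

Both ratios approach 1 at a rate of order $1/n$, which is the kind of error information that the theorem itself does not supply.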
Tauberian theory. Tauberian methods apply to functions whose growth is only
known along the positive real line. The regularity conditions are in the form of
additional assumptions on the coefficients (positivity or monotonicity) known under the
name of Tauberian "side conditions". An insightful introduction to the subject may
be found in Titchmarsh's book [469], and a detailed exposition in Postnikov's
monograph [399] and Korevaar's compendium [317]. We cite the most famous of all
Tauberian theorems due to Hardy, Littlewood, and Karamata. In this section, a function $L$ is
said to be slowly varying at infinity iff, for any $c > 0$, one has $L(cx)/L(x) \to 1$ as
$x \to +\infty$; examples of slowly varying functions are provided by powers of logarithms
or iterated logarithms.


Theorem VI.13 (The HLK Tauberian theorem). Let $f(z)$ be a power series with
radius of convergence equal to 1, satisfying

(70)    $f(z) \sim \frac{1}{(1-z)^\alpha} L\!\left( \frac{1}{1-z} \right) \qquad (z \to 1^-),$

for some $\alpha \ge 0$ with $L$ a slowly varying function. Assume that the coefficients $f_n =
[z^n] f(z)$ are all non-negative (this is the "side condition"). Then

(71)    $\sum_{k=0}^{n} f_k \sim \frac{n^\alpha}{\Gamma(\alpha+1)} L(n).$

The conclusion (71) is consistent with what singularity analysis gives: Under the
conditions, and if in addition analytic continuation is assumed, then

(72)    $f_n \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} L(n),$

which by summation yields the estimate (71).

It must be noted that a Tauberian theorem requires very little on the part of the 
function. However, it gives less since it does not provide error estimates. Also, the 
result it provides is valid in the more restrictive sense of mean values, or Cesàro averages.
(If further regularity conditions on the $f_n$ are available, for instance monotonicity,
then the conclusion of (72) can be deduced from (71) by purely elementary
real analysis.) The method applies only to functions that are large enough at their 
singularity, and despite numerous efforts to improve the conclusions, it is the case that 
Tauberian theorems have little concrete to offer in terms of error estimates. 

Appeal to a Tauberian theorem is justified when a function has, apart from the 
positive half line, a very irregular behaviour near its circle of convergence, for in- 
stance when each point of the unit circle is a singularity. (The function is then said to 
admit the unit circle as a natural boundary.) An interesting example of this situation is 
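discussed by Greene and Knuth [249] who consider the function

(73)    $f(z) = \prod_{k=1}^{\infty} \left( 1 + \frac{z^k}{k} \right)$

that is the EGF of permutations having cycles all of different lengths. A little
computation shows that

    $\log \prod_{k=1}^{\infty} \left( 1 + \frac{z^k}{k} \right) = \sum_{k=1}^{\infty} \frac{z^k}{k} - \frac{1}{2} \sum_{k=1}^{\infty} \frac{z^{2k}}{k^2} + \frac{1}{3} \sum_{k=1}^{\infty} \frac{z^{3k}}{k^3} - \cdots$
    $\qquad \sim \log \frac{1}{1-z} - \gamma + o(1).$

(Only the last line requires some care, see [249].)
Thus, we have

    $f(z) \sim \frac{e^{-\gamma}}{1-z} \quad \Longrightarrow \quad \frac{1}{n} \left( f_0 + f_1 + \cdots + f_n \right) \sim e^{-\gamma},$

by virtue of Theorem VI.13. In fact, Greene and Knuth were able to supplement this
argument by a "bootstrapping" technique and show a stronger result, namely

    $f_n \to e^{-\gamma}.$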
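The behaviour of the coefficients of $\prod_{k\ge1}(1 + z^k/k)$ is easy to observe numerically: $[z^n]$ of this product is the probability that a random permutation of size $n$ has cycles of pairwise different lengths. The sketch below (names are ours) expands the product truncated at degree $N$, one factor at a time, and compares the last coefficient and the running average with $e^{-\gamma}$; the numerical value of Euler's constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329

def distinct_cycle_length_probs(n_max):
    """Coefficients f_0..f_n_max of prod_{k>=1} (1 + z^k / k)."""
    f = [1.0] + [0.0] * n_max
    for k in range(1, n_max + 1):
        # Multiply by (1 + z^k/k) in place; descending j uses each factor once.
        for j in range(n_max, k - 1, -1):
            f[j] += f[j - k] / k
    return f

N = 300
f = distinct_cycle_length_probs(N)
target = math.exp(-EULER_GAMMA)
print(f[N], sum(f) / N, target)
```

The individual coefficients settle near $e^{-\gamma} \approx 0.5615$ noticeably faster than the averages do, in line with the stronger result quoted from Greene and Knuth.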


Darboux's method. The method of Darboux requires, as regularity condition,
that functions be smooth enough (i.e., sufficiently differentiable) on their circle of
convergence. What lies at the heart of this many-faceted method is a simple relation
between the smoothness of a function and the corresponding decrease of its Taylor
coefficients.

Theorem VI.14 (Darboux's method). Assume that $f(z)$ is continuous in the closed
disk $|z| \le 1$, and is in addition $k$ times continuously differentiable ($k \ge 0$) on $|z| = 1$.
Then

(74)    $[z^n] f(z) = o\!\left( \frac{1}{n^k} \right).$
PROOF. Start from Cauchy's coefficient formula

    $f_n = \frac{1}{2i\pi} \int_C f(z) \, \frac{dz}{z^{n+1}}.$

Because of the continuity assumption, one may take as integration contour $C$ the unit
circle. Setting $z = e^{i\theta}$ yields the Fourier version of Cauchy's coefficient formula,

(75)    $f_n = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta}) e^{-ni\theta} \, d\theta.$

The integrand in (75) is strongly oscillating and the Riemann-Lebesgue lemma of
classical analysis (see [469, p. 403]) shows that the integral giving $f_n$ tends to 0 as
$n \to \infty$.

This argument covers the case $k = 0$. The case of a general $k$ is then derived
through successive integrations by parts, as

    $[z^n] f(z) = \frac{1}{2\pi (in)^k} \int_0^{2\pi} \frac{d^k}{d\theta^k} \left[ f(e^{i\theta}) \right] e^{-ni\theta} \, d\theta.$
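The Fourier form of Cauchy's formula, $f_n = \frac{1}{2\pi}\int_0^{2\pi} f(e^{i\theta}) e^{-ni\theta}\,d\theta$, lends itself to a direct numerical experiment. The sketch below (names are ours) approximates the integral by a midpoint rule on the unit circle for the test function $(1-z)^{3/2}$, which is continuously differentiable once on $|z| = 1$, and compares with the exact binomial coefficients. The choice of test function and of 4096 sample points is an assumption made for illustration.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4096):
    """Midpoint-rule approximation of (1/2pi) * int f(e^{i t}) e^{-i n t} dt."""
    total = 0j
    for j in range(samples):
        theta = 2.0 * math.pi * (j + 0.5) / samples
        total += f(cmath.exp(1j * theta)) * cmath.exp(-1j * n * theta)
    return (total / samples).real          # real part suffices for real coefficients

def exact_coefficients(n_max):
    """[z^n] (1 - z)^(3/2): c_0 = 1, c_n = c_{n-1} * (n - 5/2) / n."""
    c = [1.0]
    for n in range(1, n_max + 1):
        c.append(c[-1] * (n - 2.5) / n)
    return c

smooth = lambda z: (1 - z) ** 1.5          # principal branch; Re(1 - z) > 0 on the samples
exact = exact_coefficients(200)
for n in (0, 1, 2, 10):
    print(n, fourier_coefficient(smooth, n), exact[n])

# Decay faster than 1/n, as predicted for one continuous derivative:
print([n * abs(exact[n]) for n in (50, 100, 200)])
```

Here the coefficients actually decay like $n^{-5/2}$, so the bound $o(1/n)$ furnished by the theorem with $k = 1$ is far from sharp.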


Various consequences of Theorem VI.14 are given in reference texts also under
the name of Darboux's method. See for instance [98, 249, 265, 496]. We shall only
illustrate the mechanism by rederiving in this framework the analysis of the EGF of
2-regular graphs. Clearly, we have

(76)    $f(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1-z}} = \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4} \sqrt{1-z} + R(z).$

There $R(z)$ is the product of $(1-z)^{3/2}$ with a function analytic at $z = 1$ that is a remainder
in the Taylor expansion of $e^{-z/2 - z^2/4}$. Thus, $R(z)$ is of class $C^1$, i.e., continuously
differentiable once. By Theorem VI.14, we have

    $[z^n] R(z) = o\!\left( \frac{1}{n} \right),$

so that

(77)    $[z^n] f(z) = \frac{e^{-3/4}}{\sqrt{\pi n}} + o\!\left( \frac{1}{n} \right).$


Darboux’s method bears some resemblance to singularity analysis in that the esti- 
mates derive from translating error terms in expansions. However, smoothness condi- 
tions, rather than plain order of growth information, are required by it. The method is 
often applied in situations like in (76)-(77) to functions that are products of the type 
$h(z)(1-z)^\alpha$ with $h(z)$ analytic at 1, or combinations thereof. In such particular cases,
Darboux’s method is however subsumed by singularity analysis. 

It is inherent to Darboux’s method that it cannot be applied to functions whose 
singular expansion only involves terms that become infinite, while singularity analy- 
sis can. A clear example arises in the analysis of the common subexpression prob- 
lem [209] where there occurs a function with a singular expansion of the form 


    $\frac{1}{\sqrt{1-z}} \, \frac{1}{\sqrt{\log \frac{1}{1-z}}} \left[ 1 + \frac{c_1}{\log \frac{1}{1-z}} + \cdots \right].$


> VI.27. Darboux versus singularity analysis. This note provides an instance where Darboux’s 
method applies whereas singularity analysis does not. Let 


    $F_r(z) = \sum_{n=0}^{\infty} \frac{z^{2^n}}{(2^n)^r}.$


The function $F_0(z)$ is singular at every point of the unit circle, and the same property holds for
any $F_r$ with $r \in \mathbb{Z}_{\ge 0}$. [Hint: $F_0$, which satisfies the functional equation $F(z) = z + F(z^2)$,
grows unboundedly near $2^n$th roots of unity.] Darboux's method can be used to derive


IG SRe) = +0(2), en 2. 


What is the best error term that can be obtained?


VI. 12. Perspective 


The method of singularity analysis expands our ability to extract coefficient
asymptotics to a far wider class of functions than the meromorphic and rational functions
of Chapters IV and V. This ability is the fundamental tool for analysing many of
the generating functions of Chapters I-III, and is applicable at a considerable level of
generality.

The basic method is straightforward and appealing: we locate singularities, es- 
tablish analyticity in a domain around them, expand the functions around the singular- 
ities, and apply general transfer theorems to take each term in the function expansion 
to a term in an asymptotic expansion of its coefficients. The method applies directly 
to a large variety of explicitly given functions, for instance combinations of ratio- 
nal functions, square roots, and logarithms, as well as to functions that are implicitly 
defined, like generating functions for tree structures, which are obtained by analytic 
inversion. Functions amenable to singularity analysis also enjoy rich closure prop- 
erties, and the corresponding operations mirror the natural operations on generating 
functions implied by the combinatorial constructions of Chapters I-III. 

This approach again sets us in the direction of the ideal situation of having a 
theory where combinatorial constructions and analytic methods fully correspond, but, 
again, the very essence of analytic combinatorics is that the theorems that provide 
asymptotic results cannot be so general as to be free of analytic side conditions. In 
the case of singularity analysis, these side conditions have to do with establishing 
analyticity in a domain around singularities. These conditions are automatically satis- 
fied by a large number of functions with moderate (at most polynomial) growth near 
their dominant singularities (most notably a large subset of the generating functions 
of combinatorial structures defined by the constructions of Chapters I-III) justifying 
precisely what we need: a term-by-term transfer from the expansion of a generating 
function at its singularity to function coefficients, including error terms. The calcula- 
tions involved in singularity analysis are rather mechanical. Salvy [424] has indeed 
succeeded in automating the analysis of a large class of generating functions in this 
way. 

Again, we can look carefully at specific combinatorial constructions and then ap- 
ply singularity analysis to general abstract schemas, thereby solving whole classes of 
combinatorial problems at once. This process (along with several important exam- 
ples) is the topic of the next chapter. After that, we consider the saddle point method 
(Chapter VIII), which is appropriate for functions with no singularities at a finite dis- 
tance (entire functions) as well as those whose growth is rapid (exponential) near their 
singularities. Singularity analysis will surface again in Chapter IX, given its crucial
technical role in obtaining uniform expansions of multivariate generating functions 
near singularities. 


General surveys of asymptotic methods in enumeration have been given by Bender [29] 
and more recently Odlyzko [377]. A general reference to asymptotic analysis that has a remark- 
ably concrete approach is De Bruijn’s book [111]. Comtet’s [98] and Wilf’s [496] books each 
devote a chapter to these questions. 

This chapter is largely based on the theory developed by Flajolet and Odlyzko in [199], 
where the term "singularity analysis" originates. An important early (and unduly neglected)
reference is the study by Wong and Wyman [501]. The theory draws its inspiration
from classical analytic number theory, for instance the prime number theorem where similar 
contours are used (see the discussion in [199] for sources). Another area where Hankel con- 
tours are used is the inversion theory of integral transforms [131], in particular in the case of 
algebraic and logarithmic singularities. Closure properties developed here are from the
articles [163, 176] by Flajolet, Fill, and Kapur.

Darboux’s method can often be employed as an alternative to singularity analysis. It is still 
the most widely used technique in the literature, though the direct mapping of asymptotic scales 
afforded by singularity analysis appears to us much more transparent. Darboux’s method is well 
explained in the books by Comtet [98], Henrici [265], Olver [381], and Wilf [496]. Tauberian 
theory is treated in detail in Postnikov’s monograph [399] and Korevaar’s encyclopedic treat- 
ment [317], with an excellent introduction to be found in Titchmarsh’s book [469]. 


VII 


Applications of Singularity Analysis 


Mathematics is being lazy. Mathematics is letting the principles do the work for you 
so that you do not have to do the work for yourself. 


-- GEORGE PÓLYA^1


I wish to God these calculations had been executed by steam.


-- CHARLES BABBAGE (1792-1871)

[Epigraph in non-Latin script; see the translation in the footnote below.]
Singularity analysis paves the way to the analysis of a large quantity of generating
functions, as provided by the symbolic method exposed in Chapters I-III. In accordance
with Pólya's aphorism, it makes it possible to "be lazy" and "let the principles
work for you". In this chapter we illustrate this situation with numerous examples
related to languages, permutations, trees, and graphs of various sorts. As in Chapter V,
most analyses are organized into broad classes called schemas.


First, we develop the general exp-log schema, which covers the set construction,
either labelled or unlabelled, applied to generators whose dominant singularity is of
logarithmic type (Section VII. 2). This typically nonrecursive schema parallels in
generality the supercritical schema of Chapter V, which is relative to sequences. It permits
us to quantify various constructions of permutations, derangements, 2-regular graphs,
mappings, and functional graphs, and even gives access to factorization properties of
polynomials over finite fields.

^1 Quoted in M Walter, T O'Brien, Memories of George Pólya, Mathematics Teaching 116 (1986).

^2 "There is an imperishable tree, it is said, that has its roots upward and its branches down and whose
leaves are the Hymns. He who knows it possesses knowledge."


Next, we deal with recursively defined structures, whose study constitutes the 
main theme of this chapter. In that case, generating functions are accessible by means 
of equations or systems that implicitly define them. A distinctive feature of many such 
combinatorial types is that their generating functions have a square-root singularity, 
that is, the singular exponent equals $\frac{1}{2}$. As a consequence, the counting sequences
characteristically involve asymptotic terms of the form $A^n n^{-3/2}$, where the latter
asymptotic exponent, $-\frac{3}{2}$, precisely reflects the singular exponent $\frac{1}{2}$ in the function's
singular expansion, in accordance with the principles of singularity analysis presented 
in Chapter VI. 

Trees are the prototypical recursively defined combinatorial type. Square-root 
singularities automatically arise for all varieties of trees constrained by a finite set of 
allowed node degrees (binary trees, unary-binary trees, ternary trees, for instance)
and many more. The counting estimates involve the characteristic $n^{-3/2}$ subexponential
factor, a property that holds in the labelled and unlabelled frameworks alike
(Section VII. 3).

Simple varieties of trees have many properties in common, beyond the subexpo- 
nential growth factor of tree counts. Indeed, in a random tree of some large size n, 
almost all nodes are found to be at level about $\sqrt{n}$, path length grows on average
like $n\sqrt{n}$, and height is of order $\sqrt{n}$ with high probability. These results serve to
unify classical tree types: we say that such properties of random trees are universal^3
amongst all simply generated families sharing the square-root singularity property.
(This notion of universality, borrowed from physics, is also nowadays finding
increasing popularity amongst probabilists, for reasons much similar to ours.) In this 
perspective, the motivation for organizing the theory along the lines of major schemas 
fits perfectly with the quest of universal laws in analytic combinatorics. 

In the context of simple varieties of trees, the square-root feature arises from 
general properties of the inverse of an analytic function. Under suitable conditions, 
this property can be extended to functions defined implicitly by a functional equation. 
Consequences are the general enumeration of nonplane unlabelled trees, including iso- 
mers of alkanes in theoretical chemistry, as well as secondary structures of molecular 
biology (Section VII. 4). 

Much of this chapter is devoted to context-free specifications and languages. In 
that case, generating functions are a priori algebraic functions, meaning that they sat- 
isfy a system of polynomial equations, itself optionally reducible (by elimination) to 
a single equation. For solutions of positive polynomial systems, square-root
singularities are found to be the rule under a simple technical condition of irreducibility
(Section VII. 6) that is evocative of the Perron-Frobenius conditions encountered in
Chapter V in relation to finite-state and transfer-matrix models. As an illustration,
we show how to develop a coherent theory of topological configurations in the plane
(trees, forests, graphs) that satisfy a non-crossing constraint.

^3 "[...] this echoes the notion of universality in statistical physics. Phenomena that appear at first to be
unconnected, such as magnetism and the phase changes of liquids and gases, share some identical features.
This universal behaviour pays no heed to whether, say, the fluid is argon or carbon dioxide. All that matters
are broad-brush characteristics such as whether the system is one-, two- or three-dimensional and whether
its component elements interact via long- or short-range forces. Universality says that sometimes the details
do not matter." [From "Utopia Theory", in Physics World, August 2003].


For arbitrary algebraic functions (the ones that are not necessarily associated to
positive coefficients and equations, or irreducible positive systems), a richer set of
singular behaviours becomes possible: singular expansions involve fractional exponents
(not just $\frac{1}{2}$, corresponding to the square-root paradigm above). Singularity
analysis is invariably applicable (Section VII.7). Algebraic functions are viewed as plane
algebraic curves and one can make use of the famous Newton-Puiseux theorem of
elementary algebraic geometry, which completely describes the types of singularities
that may occur. Algebraic functions surface as solutions of various types of functional
equations: this turns out to be the case for many classes of walks that generalize Dyck 
and Motzkin paths, via what is known as the kernel method, as well as for many types 
of planar maps (embedded planar graphs), via the so-called quadratic method. In all 
these cases, singular exponents of a predictable (rational) form are bound to occur, 
implying in turn numerous quantitative properties of random discrete structures. 


Differential equations and systems are associated to recursively defined structures,
when either pointing constructions or order constraints appear. For counting generating
functions, the equations are nonlinear, while the GFs associated to additive
parameters lead to linear versions. Differential equations are also central in connection
with the holonomic framework⁴, which gives access to the enumeration of many
classes of “hard” objects, like regular graphs and Latin rectangles. Singularity analysis
is once more instrumental in working out precise asymptotic estimates. We examine
here (Section VII.9) applications relative to quadtrees and to varieties of increasing
trees, some of which are closely related to permutations as well as to algorithms and
data structures for sorting and searching.


VII. 1. A roadmap to singularity analysis asymptotics 


The singularity analysis theorems of Chapter VI, which may be coarsely summa- 
rized by the correspondence 


(1)    $f(z) \sim (1 - z/\rho)^{-\alpha} \quad\longrightarrow\quad f_n \sim \frac{1}{\Gamma(\alpha)}\, \rho^{-n} n^{\alpha-1}$,


serve as our main asymptotic engine throughout this chapter. Singularity analysis 
is instrumental in quantifying properties of nonrecursive as well as recursive struc- 
tures. Our reader might be surprised not to encounter integration contours anymore 
in this chapter: it now suffices to work out the local analysis of functions at their 
singularities, then the general theorems of singularity analysis (Chapter VI) effect the 
translation to counting sequences and parameters automatically. 
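
The correspondence (1) can be checked numerically on the model function $(1 - z)^{-\alpha}$ (so that $\rho = 1$), whose coefficients are known in closed form. The following is a minimal sketch, restricted to $\alpha > 0$ for simplicity; the function names `exact_coeff` and `transfer_estimate` are illustrative and not taken from the text.

```python
import math

def exact_coeff(alpha: float, n: int) -> float:
    """[z^n] (1 - z)^(-alpha) = Gamma(n + alpha) / (Gamma(alpha) * n!), for alpha > 0."""
    return math.exp(math.lgamma(n + alpha) - math.lgamma(alpha) - math.lgamma(n + 1))

def transfer_estimate(alpha: float, n: int, rho: float = 1.0) -> float:
    """Right-hand side of (1): rho^(-n) * n^(alpha - 1) / Gamma(alpha)."""
    return rho ** (-n) * n ** (alpha - 1) / math.gamma(alpha)

# The ratio exact/estimate should approach 1 as n grows.
for alpha in (0.5, 1.5, 3.0):
    for n in (10, 100, 1000):
        ratio = exact_coeff(alpha, n) / transfer_estimate(alpha, n)
        print(f"alpha={alpha:<4} n={n:<5} exact/estimate = {ratio:.6f}")
```

The ratios drift towards 1 at a rate of order 1/n, in line with the error terms of the transfer theorems.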


4Holonomic functions (APPENDIX B: Holonomic functions, p. 693) are defined as solutions of linear 
differential equations with coefficients that are rational functions. 
The exp-log schema. This schema is relative to the labelled set construction,
(2)    $\mathcal{F} = \mathrm{SET}(\mathcal{G}) \implies F(z) = \exp(G(z))$,


as well as its unlabelled counterparts, MSET and PSET: an $\mathcal{F}$-structure is thus constructed
(nonrecursively) as an unordered assembly of $\mathcal{G}$-components. In the case
where the GF of components is logarithmic at its dominant singularity,


(3)    $G(z) \sim \kappa \log\frac{1}{1 - z/\rho} + \lambda$,


an immediate computation shows that F(z) has a singularity of the power type,
    $F(z) \sim e^{\lambda} (1 - z/\rho)^{-\kappa}$,


which is clearly in the range of singularity analysis. The construction (2), supple- 
mented by simple technical conditions of the form (3), defines the exp—log schema. 
Then, for such $\mathcal{F}$-structures that are definable as assemblies of logarithmic components,
the asymptotic counting problem is systematically solvable (Theorem VII.1,
p. 428); the number of G-components in a large random F-structure is O(log n), both 
in the mean and in probability, while more refined estimates describe precisely the 
likely shape of profiles. This schema has a generality comparable to the supercriti- 
cal schema examined in Section V. 4, p. 313, but the probabilistic phenomena at stake 
appear to be in sharp contrast: the number of components is typically small, being log- 
arithmic in n for exp-log sets, as opposed to a linear growth in the case of supercritical 
sequences. The schema can be used to analyse properties of permutations, functional 
graphs, mappings, and polynomials over finite fields.


Recursion and the universality of square-root singularity. A major theme of this 
chapter is the study of asymptotic properties of recursive structures. In an amazingly 
large number of cases, functions with a square root singularity are encountered, and 
given the usual correspondence, 


    $f(z) \sim -(1 - z)^{1/2} \quad\longrightarrow\quad f_n \sim \frac{1}{2\sqrt{\pi n^3}}$,

the corresponding coefficients are of the asymptotic form $C \rho^{-n} n^{-3/2}$. Several schemas
can be described to capture this phenomenon; we develop here, in order of increasing
structural complexity, the ones corresponding to simple varieties of trees, implicit
structures, Pólya operators, and irreducible polynomial systems.

Simple varieties of trees and inverse functions. Our treatment of recursive combi- 
natorial types starts with simple varieties of trees. In the basic situation, that of plane 
unlabelled trees, the equation is 


(4)    $\mathcal{Y} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{Y}) \implies Y(z) = z\,\phi(Y(z))$,


with, as usual, $\phi(w) = \sum_{\omega \in \Omega} w^{\omega}$. Thus, the OGF Y(z) is determined as the inverse
of $w/\phi(w)$, where the function $\phi$ reflects the collection of all allowed node degrees
($\Omega$). From analytic function theory, we know that singularities of the inverse of an
analytic function are generically of the square-root type (Chapters IV and VI), and 




such is the case whenever $\Omega$ is a “well behaved” set of integers, in particular, a finite
set. Then, the number of trees satisfies the estimate


(5)    $Y_n = [z^n]\,Y(z) \sim C A^n n^{-3/2}$.
Square-root singularity is also attached to several universality phenomena, as evoked 
in the general introduction to this chapter. 


Tree-like structures and implicit functions. Functions defined implicitly by an 
equation of the form 


(6) Y(z) = G(z, Y(z)) 


where G is bivariate analytic, has nonnegative coefficients, and satisfies a natural 
set of conditions (Theorem VII.3, p. 447) also lead to square-root singularity. The 
schema (6) generalizes (4): simply take $G(z, y) = z\phi(y)$.

Trees under symmetries and Pólya operators. The analytic methods evoked above
can be further extended to Pólya operators, which translate unlabelled set and cycle
constructions. A typical application is to the class of nonplane unlabelled trees whose
OGF satisfies the infinite functional equation,


    $H(z) = z \exp\left( H(z) + \frac{1}{2} H(z^2) + \frac{1}{3} H(z^3) + \cdots \right)$.


Singularity analysis applies more generally to varieties of nonplane unlabelled trees 
(Theorem VII.4, p. 457), which covers the enumeration of various types of interesting
molecules in combinatorial chemistry. 


Context-free structures and polynomial systems. The GF of any context-free class 
or language is known to be a component of a system of positive polynomial equations 


    $y_1 = P_1(z, y_1, \ldots, y_r)$
      $\vdots$
    $y_r = P_r(z, y_1, \ldots, y_r)$.


The $n^{-3/2}$ counting law is again universal amongst such combinatorial classes under
a basic condition of “irreducibility” (Theorem VII.5, p. 461). In that case, the
GFs are algebraic functions satisfying a strong positivity constraint, and the corre- 
sponding analytic statement constitutes the important Drmota-Lalley-Woods Theorem 
(Theorem VII.6, p. 466). 

Note that there is a progression in the complexity of the schemas leading to 
square-root singularity. From the analytic standpoint, this can be roughly rendered 
by a chain 


    inverse functions  →  implicit functions  →  systems.


It is however often meaningful to treat each combinatorial problem at its minimal 
level of generality, since results tend to become less and less explicit as complexity 
increases. 
Rational     Irred. linear system    $\zeta^{-n}$                                              Perron-Frobenius, merom. fns, Ch. V
             General rational        $\zeta^{-n} n^{\ell}$                                     meromorphic functions, Ch. V
Algebraic    Irred. positive sys.    $\zeta^{-n} n^{-3/2}$                                     DLW Th., sing. analysis, this chapter, §VII.6, p. 460
             General algebraic       $\zeta^{-n} n^{p/q}$                                      Puiseux, sing. analysis, this chapter, §VII.7, p. 469
Holonomic    Regular sing.           $\zeta^{-n} n^{\theta} (\log n)^{\ell}$                   ODE, sing. analysis, this chapter, §VII.9.1, p. 493
             Irregular sing.         $\zeta^{-n} e^{P(n^{1/r})} n^{\theta} (\log n)^{\ell}$    ODE, saddle-point, Ch. VIII


FIGURE VII.1. A telegraphic summary of a hierarchy of special functions by increasing
level of generality: asymptotic elements composing coefficients and the coefficient
extraction method (with $\ell, r \in \mathbb{Z}_{\ge 0}$, $p/q \in \mathbb{Q}$, $\theta$ algebraic, and P a polynomial).


General algebraic functions. In essence, the coefficients of all algebraic func- 
tions can be analysed asymptotically. (There are only minor limitations arising from 
the possible presence of several dominant singularities, like in the rational function 
case.) The starting point is the characterization of the local behaviour of an algebraic 
function at any of its singularities, which is provided by the Newton-Puiseux theorem: 
if $\zeta$ is a singularity, then the branch Y(z) of an algebraic function admits near $\zeta$ a
representation of the form


(7)    $Y(z) = Z^{r/s} \sum_{k \ge 0} c_k Z^{k/s}$,    $Z := (1 - z/\zeta)$,


for some $r/s \in \mathbb{Q}$, so that the singular exponent is invariably a rational number.
Singularity analysis is also systematically applicable, so that the nth coefficient of Y
is expressible as a finite linear combination of terms, each of the asymptotic form


(8)    $\zeta^{-n} n^{p/q}$,    $p/q \in \mathbb{Q}$;


see also Figure 1. The various quantities (like $\zeta$, r, s) entering the asymptotic expansion
of the coefficients of an algebraic function turn out to be effectively computable.

Beside providing a wide-encompassing conceptual framework of independent interest,
the general theory of algebraic coefficient asymptotics is applicable whenever
the combinatorial problem considered is not amenable to any of the special schemas
previously described. For instance, certain kinds of “supertrees” (these are defined
as trees composed with trees, Example 10, p. 394) lead to the singular type $Z^{1/4}$,
which is reflected by an unusual subexponential factor of $n^{-5/4}$ present in asymptotic
counts. Maps, which are planar graphs drawn in the plane (or on the sphere), satisfy a
universality law with a singular exponent equal to 3/2, which is associated to counting
sequences involving an asymptotic $n^{-5/2}$ factor.
Differential equations and systems. When recursion is combined with pointing
or with order constraints, enumeration problems translate into integro-differential
equations. Section VII.9 examines the types of singularities that may occur in two
important cases: (i) linear differential equations; (ii) nonlinear differential equations.

Linear differential equations arise from the analysis of parameters of splitting pro- 
cesses that extend the framework of tree recurrences (Chapter VI), and we treat the 
geometric quadtree structure in this perspective. Another especially important source 
of linear differential equations is the class of holonomic functions (solutions of linear 
equations with rational coefficients, cf APPENDIX B: Holonomic functions, p. 693), 
which includes GFs of Latin rectangles, regular graphs, permutations restricted by the 
length of their longest increasing subsequence, Young tableaux and many more struc- 
tures of combinatorial theory. In an important case, that of a “regular” singularity, 
asymptotic forms can be systematically extracted. The singularities that may occur extend
the algebraic ones (7), and the corresponding coefficients are then asymptotically
composed of elements of the form


(9)    $\zeta^{-n} n^{\theta} (\log n)^{\ell}$


($\theta$ an algebraic quantity, $\ell \in \mathbb{Z}_{\ge 0}$), which is more general than (8).

Nonlinear differential equations are typically attached to the enumeration of trees 
satisfying various kinds of order constraints. A general treatment is intrinsically not 
possible, given the extreme diversity of singular types that may occur. Accordingly, 
we restrict attention to first-order nonlinear equations of the form 


    $\frac{d}{dz} Y(z) = \phi(Y(z))$,
which covers varieties of increasing trees, including several models closely related to 
permutations. 

Figure 1 summarizes three classes of special functions encountered in this book, 
namely, rational, algebraic, and holonomic. When structural complexity increases, a 
richer set of asymptotic coefficient behaviours becomes possible. (The complex as- 
ymptotic methods employed extend much beyond the range suggested by the figure. 
For instance, the class of irreducible positive systems of polynomial equations is part
of the general square-root singularity paradigm, also encountered with Pólya operators,
as well as inverse and implicit functions in non-algebraic cases.)


VII. 2. Sets and the exp—log schema 


In this section, we examine a schema that is structurally comparable to the super- 
critical sequence schema of Section V. 4, p. 313, but that requires singularity analysis 
for coefficient extraction. The starting point is the construction of permutations (P) as 
labelled sets of cyclic permutations ($\mathcal{K}$):


(10)    $\mathcal{P} = \mathrm{SET}(\mathcal{K}) \implies P(z) = \exp(K(z))$,  where  $K(z) = \log\frac{1}{1 - z}$,


which gives rise to many easy explicit calculations. For instance, the probability that
a random permutation consists of a unique cycle is $1/n$ (since it equals $K_n/P_n$); the
number of cycles is asymptotic to $\log n$, both on average (p. 112) and in probability
(Example III.4, p. 148); the probability that a random permutation has no singleton
cycle is $\sim e^{-1}$ (the derangement problem; see pp. 113 and 216).

Similar properties hold true under surprisingly general conditions. We start with 
definitions that describe the combinatorial classes of interest. 


Definition VII.1. A function G(z) analytic at 0, having nonnegative coefficients and
finite radius of convergence $\rho$ is said to be of $(\kappa, \lambda)$-logarithmic type, where $\kappa \neq 0$, if
the following conditions hold:


(i) the number $\rho$ is the unique singularity of G(z) on $|z| = \rho$;
(ii) G(z) is continuable to a $\Delta$-domain at $\rho$;
(iii) G(z) satisfies

    $G(z) = \kappa \log\frac{1}{1 - z/\rho} + \lambda + O\left(\frac{1}{(\log(1 - z/\rho))^{2}}\right)$    as $z \to \rho$ in $\Delta$.

Definition VII.2. The labelled construction $\mathcal{F} = \mathrm{SET}(\mathcal{G})$ is said to be a labelled
exp-log schema if the exponential generating function G(z) of $\mathcal{G}$ is of logarithmic
type.

The unlabelled construction $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$ is said to be an unlabelled exp-log
schema if the ordinary generating function G(z) of $\mathcal{G}$ is of logarithmic type.


By the fact that G(z) has positive coefficients, we must have $\kappa > 0$, while the sign
of $\lambda$ is arbitrary. The definitions and the main properties to be derived for unlabelled
multisets easily extend to the powerset construction: see Notes 1 and 5 below.


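Theorem VII.1 (Exp-log schema). Consider an exp-log schema with parameters
$(\kappa, \lambda)$.


(i) The counting sequences satisfy


    $[z^n]G(z) = \frac{\kappa}{n}\, \rho^{-n} \left(1 + O\left((\log n)^{-2}\right)\right)$,
    $[z^n]F(z) = \frac{e^{\lambda + r_0}}{\Gamma(\kappa)}\, n^{\kappa - 1} \rho^{-n} \left(1 + O\left((\log n)^{-2}\right)\right)$,


where $r_0 = 0$ in the labelled case and $r_0 = \sum_{j \ge 2} G(\rho^j)/j$ in the case of unlabelled
multisets.
(ii) The number X of $\mathcal{G}$-components in a random $\mathcal{F}$-object satisfies


    $\mathbb{E}_{\mathcal{F}_n}(X) = \kappa\left(\log n - \psi(\kappa)\right) + \lambda + r_1 + O\left((\log n)^{-1}\right)$    $\left(\psi(s) \equiv \frac{d}{ds} \log \Gamma(s)\right)$,


where $r_1 = 0$ in the labelled case and $r_1 = \sum_{j \ge 2} G(\rho^j)$ in the case of unlabelled
multisets. The variance satisfies $\mathbb{V}_{\mathcal{F}_n}(X) = O(\log n)$, and, in particular, the distribution
of X is concentrated around its mean.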


We shall see in Chapter IX that, in addition, the asymptotic distribution of X is invariably 
Gaussian under such exp-log conditions. 
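
As an illustration of part (ii) in the labelled case, the main terms of the mean can be evaluated directly once $\psi$ is available. The sketch below is not part of the original text: it implements the digamma function by the recurrence $\psi(x) = \psi(x+1) - 1/x$ followed by the standard asymptotic series, and the parameter pairs $(\kappa, \lambda)$ are those worked out for the four labelled families of Examples VII.1 to VII.3. The helper names are illustrative.

```python
import math

def digamma(x: float) -> float:
    """psi(x) = d/dx log Gamma(x) for x > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while x < 6.0:              # shift upwards using psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ log x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def mean_components(n: int, kappa: float, lam: float, r1: float = 0.0) -> float:
    """Main terms of Theorem VII.1 (ii): kappa * (log n - psi(kappa)) + lambda + r1."""
    return kappa * (math.log(n) - digamma(kappa)) + lam + r1

families = {
    "Permutations": (1.0, 0.0),
    "Derangements": (1.0, -1.0),
    "2-regular":    (0.5, -0.75),
    "Mappings":     (0.5, -math.log(math.sqrt(2.0))),
}
for name, (kappa, lam) in families.items():
    print(f"{name:<13} n=100: {mean_components(100, kappa, lam):.5f}")
```

These values are only the leading part of the expansion, so they should be compared with exact means at the same size with the stated error term in mind.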

PROOF. This result is from an article by Flajolet and Soria [210], with a correction 
to the logarithmic type condition given by Jennie Hansen [257]. We first discuss the 
labelled case, F = SET(G), so that F(z) = exp G(z). 
(i) The estimate for $[z^n]G(z)$ follows directly from singularity analysis with logarithmic
terms (Theorem VI.4, p. 376). Regarding F(z), we find, by exponentiation,


(11)    $F(z) = \frac{e^{\lambda}}{(1 - z/\rho)^{\kappa}} \left[1 + O\left(\frac{1}{(\log(1 - z/\rho))^{2}}\right)\right]$.


Like G, the function $F = e^{G}$ has an isolated singularity at $\rho$, and is continuable to
a $\Delta$-domain, in which the expansion (11) is valid. The basic transfer theorem then
provides the estimate of $[z^n]F(z)$.

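(ii) Regarding the number of components, the BGF of $\mathcal{F}$ with u marking the
number of $\mathcal{G}$-components is $F(z, u) = \exp(uG(z))$, in accordance with the general
developments of Chapter III. The function


    $f_1(z) := \left.\frac{\partial}{\partial u} F(z, u)\right|_{u=1} = F(z)\,G(z)$


is the EGF of the cumulated values of X. It satisfies near $\rho$


    $f_1(z) = \frac{e^{\lambda}}{(1 - z/\rho)^{\kappa}} \left(\kappa \log\frac{1}{1 - z/\rho} + \lambda\right) \left[1 + O\left(\frac{1}{(\log(1 - z/\rho))^{2}}\right)\right]$,


whose translation, by singularity analysis theory, is immediate:


    $[z^n] f_1(z) \equiv \mathbb{E}_{\mathcal{F}_n}(X)\,[z^n]F(z) = \frac{e^{\lambda}}{\Gamma(\kappa)}\, n^{\kappa-1} \rho^{-n} \left(\kappa \log n - \kappa\psi(\kappa) + \lambda + O\left((\log n)^{-1}\right)\right)$.

This provides the mean value estimate of X as $[z^n]f_1(z)/[z^n]F(z)$. The variance
analysis is conducted in the same way, using a second derivative.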

For the unlabelled case, the analysis of $[z^n]G(z)$ can be recycled verbatim. First,
given the assumptions, we must have $\rho < 1$ (since otherwise $[z^n]G(z)$ could not be
an integer). The classical translation of multisets (Chapter I) rewrites as


    $F(z) = \exp\left(G(z) + R(z)\right)$,    $R(z) := \sum_{j \ge 2} \frac{G(z^j)}{j}$,

where R(z) involves terms of the form $G(z^2), \ldots$, each being analytic in $|z| < \rho^{1/2}$.
Thus, R(z) is itself analytic, as a uniformly convergent sum of analytic functions, in
$|z| < \rho^{1/2}$. (This follows the usual strategy for treating Pólya operators in asymptotic
theory.) Consequently, F(z) is $\Delta$-analytic. As $z \to \rho$, we then find


(12)    $F(z) = \frac{e^{\lambda + r_0}}{(1 - z/\rho)^{\kappa}} \left[1 + O\left(\frac{1}{(\log(1 - z/\rho))^{2}}\right)\right]$,    $r_0 \equiv \sum_{j \ge 2} \frac{G(\rho^j)}{j}$.


The asymptotic expansion of $[z^n]F(z)$ then results from singularity analysis.
The BGF F(z, u) of $\mathcal{F}$, with u marking the number of $\mathcal{G}$-components, is


    $F(z, u) = \exp\left(\frac{u\,G(z)}{1} + \frac{u^2 G(z^2)}{2} + \frac{u^3 G(z^3)}{3} + \cdots\right)$.


Consequently,


    $f_1(z) := \left.\frac{\partial}{\partial u} F(z, u)\right|_{u=1} = F(z)\left(G(z) + R_1(z)\right)$,    $R_1(z) = \sum_{j \ge 2} G(z^j)$.
                 n = 100     n = 272     n = 739
Permutations     5.18737     6.18485     7.18319
Derangements     4.19732     5.18852     6.18454
2-regular        2.53439     3.03466     3.53440
Mappings         2.97898     3.46320     3.95312


FIGURE VII.2. Some exp-log structures ($\mathcal{F}$) and the mean number of $\mathcal{G}$-components
for n = 100, $272 \equiv \lceil 100 \cdot e \rfloor$, $739 \equiv \lceil 100 \cdot e^2 \rfloor$ (nearest integers).


Again, the singularity type is that of F(z) multiplied by a logarithmic term,


(13)    $f_1(z) \underset{z \to \rho}{\sim} F(z)\left(G(z) + r_1\right)$,    $r_1 \equiv \sum_{j \ge 2} G(\rho^j)$.


The mean value estimate results. Variance analysis follows similarly. 


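▷ VII.1. Unlabelled powersets. For the powerset construction $\mathcal{F} = \mathrm{PSET}(\mathcal{G})$, the statement
of Theorem VII.1 holds with


    $r_0 = \sum_{j \ge 2} \frac{(-1)^{j-1}}{j}\, G(\rho^j)$,    $r_1 = \sum_{j \ge 2} (-1)^{j-1} G(\rho^j)$,

as seen by an easy adaptation of the proof techniques. ◁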

As we see below, beyond permutations, mappings, unlabelled functional graphs,
polynomials over finite fields, 2-regular graphs, and generalized derangements resort
to the exp-log schema; see Figure 2 for representative numerical data. Furthermore,
singularity analysis gives precise information on the decomposition of large $\mathcal{F}$ objects
into $\mathcal{G}$ components.
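
Exact means of the kind tabulated in Figure 2 can be recomputed from the labelled identity (mean) $= [z^n]F(z)G(z) / [z^n]F(z)$ with $F = \exp(G)$. The sketch below uses exact rational arithmetic on truncated power series; the helper names are illustrative, and only the two cycle-based families (permutations and derangements) are treated.

```python
from fractions import Fraction

def exp_series(g, n):
    """Coefficients f[0..n] of F(z) = exp(G(z)), where g[0] == 0.

    Differentiating F = exp(G) gives F' = G' F, hence
    m * f[m] = sum_{k=1..m} k * g[k] * f[m - k].
    """
    f = [Fraction(1)] + [Fraction(0)] * n
    for m in range(1, n + 1):
        f[m] = sum(k * g[k] * f[m - k] for k in range(1, m + 1)) / m
    return f

def mean_components(g, n):
    """Exact mean number of G-components in a labelled F = SET(G) object of size n."""
    f = exp_series(g, n)
    cumulated = sum(g[k] * f[n - k] for k in range(1, n + 1))  # [z^n] F(z) G(z)
    return cumulated / f[n]

def cycle_egf(n, excluded=()):
    """EGF coefficients of cycles, z^k / k, with the lengths in `excluded` removed."""
    return [Fraction(0)] + [Fraction(0) if k in excluded else Fraction(1, k)
                            for k in range(1, n + 1)]

n = 100
print("permutations:", float(mean_components(cycle_egf(n), n)))
print("derangements:", float(mean_components(cycle_egf(n, excluded={1}), n)))
```

For permutations the result is the harmonic number $H_{100}$; passing a different `excluded` set covers the generalized derangements of Example VII.1.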


EXAMPLE VII.1. Cycles in derangements. The case of permutations corresponds to radius
of convergence $\rho = 1$ and parameters $(\kappa, \lambda) = (1, 0)$, and is immediately seen to satisfy the
conditions of Theorem VII.1. Let $\Omega$ be a finite set of integers and consider next the class
$\mathcal{D} \equiv \mathcal{D}^{\Omega}$ of permutations without any cycle of length in $\Omega$. This includes standard derangements
(where $\Omega = \{1\}$). The specification is then


    $\mathcal{D} = \mathrm{SET}(\mathcal{K})$,  $\mathcal{K} = \mathrm{CYC}_{\mathbb{Z}_{>0} \setminus \Omega}(\mathcal{Z})$    $\implies$    $D(z) = \exp(K(z))$,  $K(z) = \log\frac{1}{1 - z} - \sum_{\omega \in \Omega} \frac{z^{\omega}}{\omega}$.

The theorem applies, with $\kappa = 1$, $\lambda := -\sum_{\omega \in \Omega} \omega^{-1}$. In particular, the mean number of cycles
in a random generalized derangement of size n is $\log n + O(1)$. END OF EXAMPLE VII.1.


EXAMPLE VII.2. Connected components in 2-regular graphs. The class of (undirected) 2-regular
graphs is obtained by the set construction applied to components that are themselves
undirected cycles of length ≥ 3 (see p. 123 and Example VI.2, p. 378). In that case:


    $\mathcal{F} = \mathrm{SET}(\mathcal{G})$,  $\mathcal{G} = \mathrm{UCYC}_{\ge 3}(\mathcal{Z})$    $\implies$    $F(z) = \exp(G(z))$,  $G(z) = \frac{1}{2}\log\frac{1}{1 - z} - \frac{z}{2} - \frac{z^2}{4}$.

This is an exp-log scheme with $\kappa = \frac{1}{2}$ and $\lambda = -\frac{3}{4}$. In particular the number of components
is asymptotic to $\frac{1}{2} \log n$, both in the mean and in probability. END OF EXAMPLE VII.2.
EXAMPLE VII.3. Connected components in mappings. The class $\mathcal{F}$ of mappings (functions
from a finite set to itself) has been introduced in Subsection II.5.2, p. 119. The associated
digraphs are described as labelled sets of connected components ($\mathcal{K}$), themselves (directed)
cycles of trees ($\mathcal{T}$), so that the class of all mappings has an EGF given by

    $F(z) = \exp(K(z))$,    $K(z) = \log\frac{1}{1 - T(z)}$,    $T(z) = z e^{T(z)}$,

with T the Cayley tree function. The analysis of inverse functions (Section VI.7 and Example VI.8,
p. 386) has shown that T(z) is singular at $z = e^{-1}$, where it admits the singular
expansion $T(z) \sim 1 - \sqrt{2}\sqrt{1 - ez}$. Thus K(z) is logarithmic with $\kappa = \frac{1}{2}$ and $\lambda = -\log\sqrt{2}$.
As a consequence, the number of connected mappings satisfies


    $K_n \equiv n!\,[z^n]K(z) = n^n \sqrt{\frac{\pi}{2n}} \left(1 + O(n^{-1/2})\right)$.


In other words: the probability for a random mapping of size n to consist of a single component
is $\sim \sqrt{\pi/(2n)}$. Also, the mean number of components in a random mapping of size n is


    $\frac{1}{2}\log n + \log\sqrt{2e^{\gamma}} + O(n^{-1/2})$.


Similar properties hold for mappings without fixed points that are analogous to derangements
and were discussed in Chapter II, p. 121. We shall establish below, p. 458, that unlabelled
functional graphs also adhere to the exp-log schema. END OF EXAMPLE VII.3.


EXAMPLE VII.4. Factors of polynomials over finite fields. Factorization properties of ran- 
dom polynomials over finite fields are of importance in various areas of mathematics and have 
applications to coding theory, symbolic computation, and cryptography [41, 487, 437]. (Example I.18,
p. 83, offers a preliminary discussion.)

Let $\mathbb{F}_p$ be the finite field with p elements and $\mathcal{P} = \mathbb{F}_p[X]$ the set of monic polynomials with
coefficients in the field. We view these polynomials as (unlabelled) combinatorial objects with
size identified to degree. Since a polynomial is specified by the sequence of its coefficients, one
has (with $\mathcal{A}$ the “alphabet” of coefficients, $\mathcal{A} = \mathbb{F}_p$ treated as a collection of atomic objects),


(14)    $\mathcal{P} = \mathrm{SEQ}(\mathcal{A}) \implies P(z) = \frac{1}{1 - pz}$,


and there are $p^n$ monic polynomials of degree n.

Polynomials are a unique factorization domain, since they can be subjected to Euclidean 
division. A nonconstant polynomial that has no proper nonconstant divisor is termed irreducible 
—irreducibles are the analogues of the primes in the integer realm. The unique factorization 
property implies that the collection of all polynomials is combinatorially isomorphic to the 
multiset class of the collection of irreducibles: 


(15)    $\mathcal{P} \cong \mathrm{MSET}(\mathcal{I}) \implies P(z) = \exp\left(I(z) + \frac{1}{2} I(z^2) + \frac{1}{3} I(z^3) + \cdots\right)$.

The conjunction of (14) and (15) yields a functional relation determining I(z) implicitly, which
can be solved by taking logarithms and then making use of Möbius inversion: we find


    $I(z) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log\frac{1}{1 - p z^k} = \log\frac{1}{1 - pz} + R(z)$,


where R(z) is analytic in $|z| < p^{-1/2}$. Thus I(z) is of logarithmic type with parameters


    $\kappa = 1$,    $\lambda = \sum_{k \ge 2} \frac{\mu(k)}{k} \log\frac{1}{1 - p^{1-k}}$.
[Figure: factorizations into irreducible factors of five random polynomials of degree 25 over $\mathbb{F}_2$.]


FIGURE VII.3. The factorizations of five random polynomials of degree 25 over $\mathbb{F}_2$.
One out of five polynomials in this sample has no root in the base field (the asymptotic
probability is 1/4 by Note 4).


There results that $I_n \sim p^n/n$, which constitutes a “Prime Number Theorem” for polynomials
over finite fields: A fraction asymptotic to $1/n$ of the polynomials in $\mathbb{F}_p[X]$ are irreducible. This
says that a polynomial of degree n is roughly comparable to a number written in base p having n 
digits: in effect, the proportion of prime numbers amongst numbers whose representation has 
length n is asymptotic to 1/(n log p), by virtue of the classical Prime Number Theorem. 

Since I(z) is logarithmic and P is obtained by a multiset construction, we have an exp-log 
scheme and Theorem VII.1 applies. As a consequence: The number of factors of a random poly- 
nomial of degree n is ~ log n on average and its distribution is concentrated. (See Figure 3.) 
This and similar developments lead to a complete analysis of some of the basic algorithms 
known for factoring polynomials over finite fields; see [186]. .. END OF EXAMPLE VII.4. 
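
The Möbius-inversion formula for I(z) gives, on extracting coefficients, $n I_n = \sum_{d \mid n} \mu(d)\, p^{n/d}$, which makes the estimate $I_n \sim p^n/n$ easy to observe numerically. The following is a small sketch with illustrative function names; it is not taken from the text.

```python
def mobius(n: int) -> int:
    """Moebius function mu(n), computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # a squared prime factor: mu vanishes
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def irreducible_count(p: int, n: int) -> int:
    """Number I_n of monic irreducible polynomials of degree n over the field F_p."""
    total = sum(mobius(d) * p ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

# The normalized quantity n * I_n / p^n tends to 1.
for n in (1, 2, 3, 4, 5, 10, 20, 40):
    i_n = irreducible_count(2, n)
    print(n, i_n, n * i_n / 2 ** n)
```

The correction to the ratio is of order $p^{-n/2}$, which reflects the analyticity of R(z) in $|z| < p^{-1/2}$.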


▷ VII.2. The divisor function for polynomials. Let $\delta(\varpi)$ for $\varpi \in \mathcal{P}$ be the total number of
monic polynomials (not necessarily irreducible) dividing $\varpi$: if $\varpi = \iota_1^{e_1} \cdots \iota_k^{e_k}$, where the $\iota_j$
are distinct irreducibles, then $\delta(\varpi) = (e_1 + 1) \cdots (e_k + 1)$. One has


    $\mathbb{E}_{\mathcal{P}_n}(\delta) = \frac{[z^n] \prod_{j \ge 1} (1 + 2z^j + 3z^{2j} + \cdots)^{I_j}}{[z^n] \prod_{j \ge 1} (1 + z^j + z^{2j} + \cdots)^{I_j}} = \frac{[z^n]\,P(z)^2}{[z^n]\,P(z)}$,


so that the mean value of $\delta$ over $\mathcal{P}_n$ is exactly (n + 1). This evaluation is relevant to polynomial
factorization over $\mathbb{Z}$ since it gives an upper bound on the number of irreducible factor
combinations that need to be considered in order to lift a factorization from $\mathbb{F}_p(X)$ to $\mathbb{Z}(X)$;
see [308, 487].


▷ VII.3. The cost of finding irreducible polynomials. Assume that it takes expected time t(n)
to test a random polynomial of degree n for irreducibility. Then it takes expected time ~ n t(n)
to find a random irreducible polynomial of degree n: simply draw a polynomial at random and test it for
irreducibility, repeating until success. Testing for irreducibility can be achieved by developing a polynomial factorization
algorithm which is stopped as soon as a nontrivial factor is found. See works by Panario et al.
for detailed analyses [383, 384]. ◁


Profiles of exp-log structures. Under the exp-log conditions, it is also possible
to analyse the profile of structures, that is, the number of components of size r for
each fixed r. We recall here that the Poisson distribution of parameter $\nu$ is the law of
a discrete random variable Y such that


    $\mathbb{E}(u^Y) = e^{-\nu(1 - u)}$,    $\mathbb{P}(Y = k) = e^{-\nu}\, \frac{\nu^k}{k!}$.

A variable Y is said to be a negative binomial of parameter (m, a) if its probability
generating function and its individual probabilities satisfy:


    $\mathbb{E}(u^Y) = \left(\frac{1 - a}{1 - au}\right)^{m}$,    $\mathbb{P}(Y = k) = \binom{m + k - 1}{k}\, a^k (1 - a)^m$.

(The quantity $\mathbb{P}(Y = k)$ is the probability that the mth success in a sequence of
independent trials with individual success probability 1 − a occurs at time m + k; see [162,
p. 165].)
Proposition VII.1 (Profiles of exp-log structures). Assume the conditions of Theorem VII.1
and let $X^{(r)}$ be the number of $\mathcal{G}$-components of size r in an $\mathcal{F}$-object. In the
labelled case, $X^{(r)}$ admits a limit distribution of the Poisson type: for any fixed k,


(16)    $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = e^{-\nu}\, \frac{\nu^k}{k!}$,    $\nu = g_r \rho^r$,    $g_r \equiv [z^r]G(z)$.


In the unlabelled case, $X^{(r)}$ admits a limit distribution of the negative binomial type:
for any fixed k,

(17)    $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = \binom{G_r + k - 1}{k}\, a^k (1 - a)^{G_r}$,    $a = \rho^r$,    $G_r \equiv [z^r]G(z)$.


PROOF. In the labelled case, the BGF of $\mathcal{F}$ with u marking the number $X^{(r)}$ of r-components is

    $F(z, u) = \exp\left((u - 1)\, g_r z^r\right) F(z)$.

Extracting the coefficient of $u^k$ leads to

    $\phi_k(z) := [u^k] F(z, u) = \exp(-g_r z^r)\, \frac{(g_r z^r)^k}{k!}\, F(z)$.

The singularity type of $\phi_k(z)$ is that of F(z) since the prefactor (an exponential multiplied
by a polynomial) is entire, so that singularity analysis applies directly. As a
consequence, one finds


    $[z^n]\phi_k(z) \sim \exp(-g_r \rho^r)\, \frac{(g_r \rho^r)^k}{k!} \cdot \left([z^n]F(z)\right)$,


which provides the distribution of $X^{(r)}$ under the form stated in (16).
In the unlabelled case, the starting BGF equation is


    $F(z, u) = \left(\frac{1 - z^r}{1 - u z^r}\right)^{G_r} F(z)$,


and the analytic reasoning is similar to the labelled case.


The unlabelled version of Proposition VII.1 covers in particular polynomials over 
finite fields; see [186, 298] for related results. 
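
The labelled case of Proposition VII.1 can be observed on the simplest instance, fixed points of permutations: there r = 1, $g_1 = 1$ and $\rho = 1$, so that the limit law is Poisson of parameter $\nu = 1$. The sketch below relies on the classical count $\binom{n}{k} d_{n-k}$ of permutations with exactly k fixed points, where $d_m$ is the number of derangements; this identity and the function names are assumptions brought in for the illustration, not statements of the text.

```python
import math

def derangements(m: int) -> list:
    """Return [d_0, ..., d_m] using d_k = (k - 1) * (d_{k-1} + d_{k-2})."""
    d = [1, 0]
    for k in range(2, m + 1):
        d.append((k - 1) * (d[k - 1] + d[k - 2]))
    return d[:m + 1]

def fixed_point_law(n: int, k: int) -> float:
    """Exact probability that a random permutation of size n has k fixed points."""
    d = derangements(n)
    return math.comb(n, k) * d[n - k] / math.factorial(n)

n = 12
for k in range(5):
    exact = fixed_point_law(n, k)
    poisson = math.exp(-1.0) / math.factorial(k)
    print(f"k={k}: exact={exact:.8f}  Poisson(1)={poisson:.8f}")
```

Convergence is very fast in this particular case, since the generating function of the prefactor is entire.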
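▷ VII.4. Mean profiles. The mean value of $X^{(r)}$ satisfies


    $\mathbb{E}_{\mathcal{F}_n}(X^{(r)}) \sim g_r \rho^r$,    $\mathbb{E}_{\mathcal{F}_n}(X^{(r)}) \sim G_r\, \frac{\rho^r}{1 - \rho^r}$,
in the labelled and unlabelled (multiset) case respectively. In particular: the mean number of
roots of a random polynomial over $\mathbb{F}_p$ that lie in the base field $\mathbb{F}_p$ is asymptotic to $\frac{p}{p-1}$. Also: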
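                      Plane                                                                      Non-plane
Unlabelled (OGF)      $\mathcal{V} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{V})$      $\mathcal{V} = \mathcal{Z} \times \mathrm{MSET}_{\Omega}(\mathcal{V})$
                      $V(z) = z\phi(V(z))$                                                       $V(z) = z\Phi(V(z))$
                      $\phi(u) := \sum_{\omega \in \Omega} u^{\omega}$                           ($\Phi$ a Pólya operator)
Labelled (EGF)        $\mathcal{V} = \mathcal{Z} \star \mathrm{SEQ}_{\Omega}(\mathcal{V})$       $\mathcal{V} = \mathcal{Z} \star \mathrm{SET}_{\Omega}(\mathcal{V})$
                      $V(z) = z\phi(V(z))$                                                       $V(z) = z\phi(V(z))$
                      $\phi(u) := \sum_{\omega \in \Omega} u^{\omega}$                           $\phi(u) := \sum_{\omega \in \Omega} \frac{u^{\omega}}{\omega!}$


FIGURE VII.4. Functional equations satisfied by generating functions of degree-restricted
families of trees.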


the probability that a polynomial has no root in the base field is asymptotic to $(1 - 1/p)^p$. (For
random polynomials with real coefficients, a famous result of Kac (1943) asserts that the mean
number of real roots is $\sim \frac{2}{\pi} \log n$; see [146].) ◁


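▷ VII.5. Profiles of powersets. In case of unlabelled powersets $\mathcal{F} = \mathrm{PSET}(\mathcal{G})$ (no repetitions
of elements allowed), the distribution of $X^{(r)}$ satisfies


    $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = \binom{G_r}{k}\, a^k (1 - a)^{G_r - k}$,    $a = \frac{\rho^r}{1 + \rho^r}$;


i.e., the limit is a binomial law of parameters $(G_r, \rho^r/(1 + \rho^r))$. ◁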


VII. 3. Simple varieties of trees and inverse functions 


A unifying theme in this chapter is the enumeration of rooted trees determined
by restrictions on the collection of allowed node degrees: some set $\Omega \subseteq \mathbb{Z}_{\ge 0}$ containing
0 (for leaves) and at least another number $d \ge 2$ (to avoid trivialities) being
fixed, all node outdegrees are constrained to lie in $\Omega$. Corresponding to the four combinations,
unlabelled/labelled and plane/nonplane, there are four types of functional
equations summarized by Figure 4. In three of the four cases, namely, 


unlabelled plane, labelled plane, and labelled nonplane, 


the generating function (OGF for unlabelled, EGF for labelled) satisfies an equation 
of the form 


(18)    $y(z) = z\,\phi(y(z))$.
In accordance with earlier conventions (p. 182), we name simple variety of trees any
family of trees whose GF satisfies an equation of the form (18). (The functional equation
satisfied by the OGF of a degree-restricted variety of nonplane unlabelled trees
furthermore involves a Pólya operator $\Phi$, which implies the presence of terms of the
form $y(z^2), y(z^3), \ldots$: such cases are discussed in Section VII.5 below.)

The relation $y = z\phi(y)$ has already been examined in Section VI.7, p. 385,
from the point of view of singularity analysis. For convenience, we encapsulate the 
conditions of the main theorem of that section, Theorem VI.6, p. 387, into a definition. 
Definition VII.3. Let y(z) be a function analytic at 0. It is said to belong to the
smooth inverse-function schema if there exists a function $\phi(u)$ analytic at 0, such that
in a neighbourhood of 0 one has


    $y(z) = z\,\phi(y(z))$,


and $\phi(u)$ satisfies the following conditions.
— Condition (H1): The function $\phi(u)$ is such that


(19)    $\phi(0) \neq 0$,    $[u^n]\phi(u) \ge 0$,    $\phi(u) \not\equiv \phi_0 + \phi_1 u$.


— Condition (H2): Within the open disc of convergence of $\phi$ at 0, $|z| < R$, there
exists a (necessarily unique) positive solution to the characteristic equation:


(20)    $\exists \tau$,  $0 < \tau < R$,    $\phi(\tau) - \tau\phi'(\tau) = 0$.


A class $\mathcal{Y}$ whose generating function y(z) (either ordinary or exponential) satisfies
these conditions is also said to belong to the smooth inverse-function schema.
The schema is said to be aperiodic if $\phi(u)$ is an aperiodic function of u.


VII. 3.1. Asymptotic counting. As we saw on general grounds in Chapters IV 
and VI, inversion fails to be analytic when the first derivative of the function to be 
inverted vanishes (hence the characteristic equation). The heart of the matter is that, 
at the point of failure $y = \tau$, corresponding to $z = \tau/\phi(\tau)$ (the radius of convergence
of y(z) at 0), the dependency $y \mapsto z$ becomes quadratic, so that its inverse $z \mapsto y$
gives rise to a square-root singularity, from which the typical $n^{-3/2}$ term in coefficient
asymptotics results (Theorem VI.6, p. 387). In view of our needs in this chapter, we 
rephrase Theorem VI.6 as follows. 


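Theorem VII.2. Let y(z) belong to the smooth inverse-function schema, in the sense
of Definition VII.3, and be aperiodic. Then, with $\tau$ the positive root of the characteristic
equation and $\rho = \tau/\phi(\tau)$, one has


    $[z^n]\,y(z) = \sqrt{\frac{\phi(\tau)}{2\phi''(\tau)}}\; \frac{\rho^{-n}}{\sqrt{\pi n^3}} \left[1 + O\left(\frac{1}{n}\right)\right]$.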
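
The estimate of Theorem VII.2 can be tested on unary-binary trees, for which $\phi(u) = 1 + u + u^2$: the characteristic equation $\phi(\tau) - \tau\phi'(\tau) = 1 - \tau^2 = 0$ gives $\tau = 1$, hence $\rho = 1/3$, $\phi(\tau) = 3$ and $\phi''(\tau) = 2$. The sketch below compares exact counts, obtained from the coefficient recurrence implied by $y = z(1 + y + y^2)$, with the asymptotic formula; the function names are illustrative.

```python
import math

def unary_binary_counts(n_max: int) -> list:
    """y[n] = number of plane trees with n nodes and outdegrees in {0, 1, 2}.

    From y = z (1 + y + y^2):  y[n] = y[n-1] + sum_{i+j=n-1} y[i] * y[j]  for n >= 2.
    """
    y = [0] * (n_max + 1)
    y[1] = 1
    for n in range(2, n_max + 1):
        conv = sum(y[i] * y[n - 1 - i] for i in range(1, n - 1))
        y[n] = y[n - 1] + conv
    return y

def theorem_estimate(phi_tau: float, phi2_tau: float, rho: float, n: int) -> float:
    """Main term sqrt(phi(tau) / (2 phi''(tau))) * rho^(-n) / sqrt(pi n^3)."""
    return math.sqrt(phi_tau / (2.0 * phi2_tau)) * rho ** (-n) / math.sqrt(math.pi * n ** 3)

y = unary_binary_counts(200)
for n in (10, 50, 200):
    print(n, y[n] / theorem_estimate(3.0, 2.0, 1.0 / 3.0, n))
```

The printed ratios approach 1 with a discrepancy of order 1/n, as the error term of the theorem predicts.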


As we also know from Theorem VI.6, a full (locally convergent) expansion of
y(z) in powers of $\sqrt{1 - z/\rho}$ exists, starting with


(21)    $y(z) = \tau - \gamma \sqrt{1 - z/\rho} + O(1 - z/\rho)$,    $\gamma := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}}$,


which implies a full asymptotic expansion for $y_n = [z^n]y(z)$ in odd powers of $1/\sqrt{n}$.
(The statement extends to the periodic case, with the necessary condition that $n \equiv 1 \pmod{p}$,
when $\phi$ has period p.)

We have seen already that this framework covers binary, unary-binary, general
Catalan, as well as Cayley trees (Figure 10, p. 389). Here is another typical application.
EXAMPLE VII.5. Mobiles. A (labelled) mobile, as defined by Bergeron, Labelle, and Leroux
[39, p. 240], is a (labelled) tree in which subtrees dangling from the root are taken up to
cyclic shift.

(Figure: the mobiles of sizes 1 to 4, counted by 1, 2, 3! + 3 = 9, and 4! + 4×2 + 4×3 + 4×3×2 = 68.)

(Think of Alexander Calder’s creations.) The specification and EGF equation are

$$\mathcal{M} = \mathcal{Z} \star (1 + \mathrm{CYC}(\mathcal{M})) \quad\Longrightarrow\quad M(z) = z\left(1 + \log\frac{1}{1 - M(z)}\right).$$

(By definition, cycles have at least one component, so that the neutral structure must be added
to allow for leaf creation.) The EGF starts as
$M(z) = z + 2\frac{z^2}{2!} + 9\frac{z^3}{3!} + 68\frac{z^4}{4!} + 730\frac{z^5}{5!} + \cdots$,
whose coefficients constitute EIS A038037.
The verification of the conditions of the theorem is immediate. We have
$\phi(u) = 1 + \log(1 - u)^{-1}$, whose radius of convergence is 1. The characteristic equation reads


$$1 + \log\frac{1}{1 - \tau} - \frac{\tau}{1 - \tau} = 0,$$

which has a unique positive root at $\tau \doteq 0.68215$. (In fact, one has $\tau = 1 - 1/T(e^{-2})$, with T
the Cayley tree function.) The radius of convergence is $\rho = 1/\phi'(\tau) = 1 - \tau$. The asymptotic
formula for the number of mobiles then results:

$$\frac{1}{n!} M_n \sim C \cdot \frac{A^n}{n^{3/2}}, \qquad\text{where}\quad C \doteq 0.18576,\quad A = \frac{1}{\rho} \doteq 3.14619.$$

(This example is adapted from [39, p. 261], with corrections.) END OF EXAMPLE VII.5.
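
The first coefficients of the mobile EGF and the characteristic root can be recomputed directly from the functional equation. The sketch below is an illustration under our own naming, not the authors' procedure: it iterates $M \leftarrow z(1 + \log(1-M)^{-1})$ on truncated power series with exact rational arithmetic, using $\log(1-M)^{-1} = \sum_{k\ge1} M^k/k$, and solves the characteristic equation by bisection.

```python
from fractions import Fraction
from math import factorial, log

def series_mul(a, b, N):
    """Product of two power series (coefficient lists of length N+1), truncated after z^N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(N + 1 - i):
            c[i + j] += ai * b[j]
    return c

def mobile_counts(N):
    """Return [M_1, ..., M_N], where M(z) = sum_n M_n z^n / n!."""
    M = [Fraction(0)] * (N + 1)
    for _ in range(N):                        # each pass fixes one more coefficient
        L = [Fraction(0)] * (N + 1)           # L = log 1/(1 - M) = sum_{k>=1} M^k / k
        power = [Fraction(1)] + [Fraction(0)] * N
        for k in range(1, N + 1):
            power = series_mul(power, M, N)
            for i in range(N + 1):
                L[i] += power[i] / k
        one_plus_L = [1 + L[0]] + L[1:]
        M = [Fraction(0)] + one_plus_L[:N]    # multiplication by z shifts coefficients
    return [int(M[n] * factorial(n)) for n in range(1, N + 1)]

def mobile_tau(tol=1e-13):
    """Positive root of 1 + log(1/(1-t)) - t/(1-t) = 0 on (0, 1), by bisection."""
    g = lambda t: 1 + log(1 / (1 - t)) - t / (1 - t)
    lo, hi = 0.0, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `mobile_counts(5)` reproduces the counts 1, 2, 9, 68, 730 quoted in the example, and `mobile_tau()` agrees with the stated value of the characteristic root to the digits given.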


▷ VII.6. Trees with node degrees that are prime numbers. Let $\mathcal{P}$ be the class of all plane
unlabelled trees such that the (out)degrees of internal nodes belong to the set of prime numbers,
{2, 3, 5, ...}. One has $P(z) = z + z^3 + z^4 + 2z^5 + 6z^6 + 8z^7 + 29z^8 + 50z^9 + \cdots$,
and $P_n \sim C\omega^n n^{-3/2}$, with $\omega \doteq 2.79256\,84676$. The asymptotic form “forgets” many
details of the distribution of primes, so that it can be obtained to great accuracy. (Compare with
Example V.10, p. 317 and Note 23, p. 458.) ◁
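
The initial coefficients in Note VII.6 can be recovered by iterating the functional equation $P = z(1 + \sum_{p\ \mathrm{prime}} P^p)$ on truncated series. This is a minimal sketch with hypothetical helper names; only primes up to the truncation order matter, since $P^p$ starts at $z^p$.

```python
def int_series_mul(a, b, N):
    """Product of two integer power series truncated after z^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(N + 1 - i):
            c[i + j] += ai * b[j]
    return c

def primes_up_to(n):
    """Primes <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def prime_degree_tree_counts(N):
    """Coefficients [P_1, ..., P_N] of P(z) = z * (1 + sum over primes p of P(z)^p)."""
    primes = primes_up_to(N)
    P = [0] * (N + 1)
    for _ in range(N):                    # each pass fixes one more coefficient
        total = [1] + [0] * N             # the constant 1 accounts for leaves
        power = [1] + [0] * N             # running power P^exponent
        exponent = 0
        for p in primes:
            while exponent < p:
                power = int_series_mul(power, P, N)
                exponent += 1
            total = [t + q for t, q in zip(total, power)]
        P = [0] + total[:N]               # multiplication by z
    return P[1:]
```

With N = 9 this returns 1, 0, 1, 1, 2, 6, 8, 29, 50, matching the expansion of P(z) above.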


VII. 3.2. Basic tree parameters. Throughout this subsection, we consider a simple
variety of trees $\mathcal{V}$, whose generating function (OGF or EGF, as the case may be)
will be denoted by y(z), satisfying the inverse relation $y = z\phi(y)$. In order to place
all cases under a single umbrella, we shall write $y_n = [z^n]y(z)$, so that the number of
trees of size n is either $V_n = y_n$ (unlabelled case) or $V_n = n!\,y_n$ (labelled case). We
postulate throughout that y(z) belongs to the smooth inverse-function schema and is
aperiodic.

As already seen on several occasions in Chapter III (Section III. 5, p. 170), additive
parameters lead to generating functions that are expressible in terms of the basic
tree generating function y(z). Now that singularity analysis is available, such generating
functions can be exploited systematically, with a wealth of asymptotic estimates
relative to trees of large sizes coming within easy reach. The universality of the square-
root singularity amongst varieties of trees that satisfy the smoothness assumption then
implies universal behaviour for many tree parameters, which we now list.


— Node degrees. The degree of the root in a large random tree is O(1) on 
average and with high probability, and its asymptotic distribution can be 
generally determined (Example 6). A similar property holds for the degree 
of a random node in a random tree (Example 8). 

— Level profiles can also be determined. The quantity of interest is the mean 
number of nodes in the kth layer from the root in a random tree. It is seen 
for instance that, near the root, a tree from a simple variety tends to grow lin- 
early (Example 7), this in sharp contrast with other random tree models (for 
instance, increasing trees, Subsection VII.9.2, p. 500), where the growth is 
exponential. This property is one of the numerous indications that random 
trees taken from simple varieties are skinny and far from having perfectly 
balanced shape. A related property is the fact that path length is on average 
$O(n\sqrt{n})$ (Example 9), which means that the typical depth of a random node
in a random tree is $O(\sqrt{n})$.


These basic properties are only the tip of an iceberg. Indeed, Meir and Moon, who
launched the study of simple varieties of trees (the seminal paper [356] can serve as a
good starting point) have worked out literally several dozen analyses of parameters of
trees, using a strategy similar to the one exposed here⁵. We shall have occasion in later
chapters to return to probabilistic properties of simple varieties of trees satisfying the
smooth inverse-function schema; we only indicate here for completeness that height
is known generally to scale as $\sqrt{n}$ and is associated to a limiting theta distribution (see
Proposition V.4, p. 306 for the special case of Catalan trees and [197, 180, 314] for
general results), with similar properties holding true for width as shown by Odlyzko-Wilf
and Chassaing-Marckert-Yor [83, 379].


EXAMPLE VII.6. Root degrees in simple varieties. Here is an immediate application of
singularity analysis, one that exemplifies the synthetic type of reasoning that goes along with
the method. Take for notational simplicity a simple family $\mathcal{V}$ that is unlabelled, with OGF
$V(z) \equiv y(z)$. Let $\mathcal{V}^{[k]}$ be the subset of $\mathcal{V}$ composed of all trees whose root has degree equal
to k. Since a tree in $\mathcal{V}^{[k]}$ is formed by appending a root to a collection of k trees, one has

$$V^{[k]}(z) = \phi_k\, z\, y(z)^k, \qquad \phi_k := [w^k]\phi(w).$$

For any fixed k, a singular expansion results from raising both members of (21) to the kth power;
in particular,

$$V^{[k]}(z) = \phi_k z\left[\tau^k - k\gamma\tau^{k-1}\sqrt{1 - \frac{z}{\rho}} + O\!\left(1 - \frac{z}{\rho}\right)\right]. \tag{22}$$


This is to be compared to the basic estimate (21): the ratio $V_n^{[k]}/V_n$ is then asymptotic to the
ratio of the coefficients of $\sqrt{1 - z/\rho}$ in the corresponding generating functions, $V^{[k]}(z)$ and
$V(z) \equiv y(z)$. Thus, for any fixed k, we have found that

$$\frac{V_n^{[k]}}{V_n} = \rho\, k\, \phi_k \tau^{k-1} + O(n^{-1/2}). \tag{23}$$

(The error term is in fact of the form $O(n^{-1})$, as seen when pushing the expansion one step
further.)

⁵ The main difference is that Meir and Moon appeal to the Darboux-Pólya method discussed in
Section VI. 11 instead of singularity analysis.

Tree             φ(w)            τ, ρ           PGF of root degree      (type)

simple variety   φ(w)            τ, ρ           u φ'(τu)/φ'(τ)

binary           (1 + w)^2       1, 1/4         u/2 + u^2/2             (Bernoulli)
unary-binary     1 + w + w^2     1, 1/3         u/3 + 2u^2/3            (Bernoulli)
general          (1 − w)^{-1}    1/2, 1/4       u/(2 − u)^2             (sum of two geometric)
Cayley           e^w             1, e^{-1}      u e^{u−1}               (shifted Poisson)

FIGURE VII.5. The distribution of root degree in simple varieties of trees of the smooth
inverse-function schema.

The ratio $V_n^{[k]}/V_n$ is the probability that the root of a random tree of size n has degree k.
Since $\rho = 1/\phi'(\tau)$, one can rephrase (23) as follows: In a smooth simple variety of trees, the
random variable $\Delta$ representing root-degree admits a discrete limit distribution given by

$$\lim_{n\to\infty} \mathbb{P}_{\mathcal{V}_n}(\Delta = k) = \frac{k\,\phi_k \tau^{k-1}}{\phi'(\tau)}. \tag{24}$$

(By general principles exposed in Chapter IX, convergence is uniform.) Accordingly, the
probability generating function (PGF) of the limit law admits the simple expression

$$\mathbb{E}_{\mathcal{V}}(u^{\Delta}) = \frac{u\,\phi'(\tau u)}{\phi'(\tau)}.$$

The distribution is thus characterized by the fact that its PGF is a scaled version of the
derivative of the basic tree constructor $\phi(w)$. Figure 5 summarizes this property together with its
specialization to our four pilot examples. END OF EXAMPLE VII.6.
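
The limit law of root degree can be compared with exact finite-size probabilities. The following sketch takes general Catalan trees as a test case ($\phi(w) = 1/(1-w)$, so $\phi_k = 1$, $\tau = 1/2$, $\phi'(\tau) = 4$); it computes $[z^{n-1}]y(z)^k$ by repeated truncated convolution. The function names are our own.

```python
from math import comb

def catalan_root_degree_probability(n, k):
    """Exact probability that the root of a uniform general Catalan tree of size n has degree k."""
    # y[m] = number of general plane trees with m nodes, for m = 0 .. n-1.
    y = [0] + [comb(2 * m - 2, m - 1) // m for m in range(1, n)]
    # V^[k](z) = z * y(z)^k here, so the count of interest is [z^(n-1)] y(z)^k.
    power = [1] + [0] * (n - 1)
    for _ in range(k):
        power = [sum(power[i] * y[m - i] for i in range(m + 1)) for m in range(n)]
    total = comb(2 * n - 2, n - 1) // n
    return power[n - 1] / total

def limit_root_degree_probability(k):
    """Limit value k * phi_k * tau^(k-1) / phi'(tau) with tau = 1/2, phi_k = 1, phi'(tau) = 4."""
    return k * 0.5 ** (k - 1) / 4
```

For n of a few hundred the exact values are already close to the limits k/2^{k+1}, the discrepancy being of order 1/n.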


Additive functionals. Singularity analysis gives access to many additive parameters
of trees. Consider three tree parameters, $\xi, \eta, \sigma$ satisfying the basic relation,

$$\xi(t) = \eta(t) + \sum_{j=1}^{\deg(t)} \sigma(t_j), \tag{25}$$

which can be taken to define $\xi(t)$ in terms of the simpler parameter $\eta(t)$ and the sum
of values of $\sigma$ over the root subtrees of t ($\deg(t)$ is the degree of the root of t and $t_j$ is
the jth root subtree). As we are interested in average-case analysis, we introduce the
cumulative GFs,

$$\Xi(z) = \sum_{t\in\mathcal{V}} \xi(t)\, z^{|t|}, \qquad H(z) = \sum_{t\in\mathcal{V}} \eta(t)\, z^{|t|}, \qquad \Sigma(z) = \sum_{t\in\mathcal{V}} \sigma(t)\, z^{|t|}, \tag{26}$$

assuming an unlabelled variety of trees for simplicity. We first state a simple algebraic
result which formalizes several of the calculations of Section III. 5, p. 170.
Lemma VII.1 (Iteration lemma for trees). For tree parameters from a simple variety
with GF y(z) that satisfy the additive relation (25), the cumulative generating functions (26)
are related by

$$\Xi(z) = H(z) + z\phi'(y(z))\,\Sigma(z). \tag{27}$$

In particular, if $\xi$ is defined recursively in terms of $\eta$, that is, $\sigma \equiv \xi$, one has

$$\Xi(z) = \frac{H(z)}{1 - z\phi'(y(z))} = \frac{z y'(z)}{y(z)}\, H(z). \tag{28}$$

In the case of a recursive parameter, unwinding the recursion shows that
$\xi(t) := \sum_{s \preceq t} \eta(s)$, where the sum is extended to all subtrees s of t (written $s \preceq t$).
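PROOF. We have

$$\Xi(z) = H(z) + \widetilde{\Xi}(z), \qquad\text{where}\quad \widetilde{\Xi}(z) := \sum_{t\in\mathcal{V}} \left( z^{|t|} \sum_{j=1}^{\deg(t)} \sigma(t_j) \right).$$

Splitting the expression of $\widetilde{\Xi}(z)$ according to the values r of root degree, we find
(with summation over all r-tuples $(t_1, \ldots, t_r)$ of trees understood in the first line)

$$\begin{aligned}
\widetilde{\Xi}(z) &= \sum_{r\ge 0} \phi_r\, z^{1+|t_1|+\cdots+|t_r|} \left(\sigma(t_1) + \sigma(t_2) + \cdots + \sigma(t_r)\right) \\
&= z \sum_{r\ge 0} \phi_r \left(\Sigma(z)\,y(z)^{r-1} + y(z)\,\Sigma(z)\,y(z)^{r-2} + \cdots + y(z)^{r-1}\,\Sigma(z)\right) \\
&= z\,\Sigma(z) \cdot \sum_{r\ge 0} \left(r \phi_r\, y(z)^{r-1}\right),
\end{aligned}$$

which yields the linear relation expressing $\Xi$ in (27).

In the recursive case, the function $\Xi$ is determined by a linear equation, namely
$\Xi(z) = H(z) + z\phi'(y(z))\,\Xi(z)$, which, once solved, provides the first form of (28).
Differentiation of the fundamental relation $y = z\phi(y)$ yields the identity

$$y'\left(1 - z\phi'(y)\right) = \phi(y) = \frac{y}{z}, \qquad\text{i.e.,}\qquad 1 - z\phi'(y) = \frac{y}{z y'},$$

from which the second form results.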
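
The identity $1 - z\phi'(y) = y/(zy')$ behind the second form of (28) is easy to check numerically on a variety with a closed-form generating function. The sketch below is an illustration only; it uses general Catalan trees, for which $y(z) = (1 - \sqrt{1-4z})/2$ and $\phi'(w) = (1-w)^{-2}$, and the sample points are arbitrary.

```python
import math

def catalan_identity_sides(z):
    """Both sides of 1 - z*phi'(y) = y / (z*y') for phi(w) = 1/(1 - w), with 0 < z < 1/4."""
    root = math.sqrt(1 - 4 * z)
    y = (1 - root) / 2               # solves y = z / (1 - y)
    dy = 1 / root                    # y'(z)
    left = 1 - z / (1 - y) ** 2      # 1 - z * phi'(y)
    right = y / (z * dy)
    return left, right
```

Both sides simplify to $2\sqrt{1-4z}/(1+\sqrt{1-4z})$ for this variety, so the two returned values should agree up to rounding.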


▷ VII.7. A combinatorial interpretation. For a recursive parameter, we can view $\Xi(z)$ as the
GF of trees with one subtree marked, to which is attached a weight of $\eta$. Then (28) can be
interpreted as follows: point to an arbitrary node at a tree in $\mathcal{V}$ (the GF is $zy'(z)$), “subtract” the
tree attached to this node (a factor of $y(z)^{-1}$), and replace it by the same tree but now weighted
by $\eta$ (the GF is H(z)). ◁


▷ VII.8. Labelled varieties. Formulae (27) and (28) hold verbatim for labelled trees (either
of the plane or nonplane type), provided we interpret $y(z), \Xi(z), H(z)$ as EGFs:
$\Xi(z) := \sum_{t\in\mathcal{V}} \xi(t)\, z^{|t|}/|t|!$, and so on. ◁


EXAMPLE VII.7. Mean level profile in simple varieties. The question we address here is
that of determining the mean number of nodes at level k (i.e., at distance k from the root) in a
random tree of some large size n. An explicit expression for the joint distribution of nodes at
all levels has been developed in Subsection III. 6.2, p. 182, but this multivariate representation
is somewhat hard to interpret in concrete terms.
Let $\xi_k(t)$ be the number of nodes at level k in tree t. Define the generating function of
cumulated values,

$$X_k(z) := \sum_{t\in\mathcal{V}} \xi_k(t)\, z^{|t|}.$$

Clearly, $X_0(z) \equiv y(z)$ since each tree has a unique root. Then, since the parameter $\xi_k$ is
the sum over subtrees of parameter $\xi_{k-1}$, we are in a situation exactly covered by (27), with
$\eta(t) \equiv 0$. The relation $X_k(z) = z\phi'(y(z))\,X_{k-1}(z)$ is then immediately solved by recurrence,
to the effect that

$$X_k(z) = \left(z\phi'(y(z))\right)^k y(z). \tag{29}$$

Making use of the (analytic) expansion of $\phi'$ at $\tau$, namely, $\phi'(y) \sim \phi'(\tau) + \phi''(\tau)(y - \tau)$, and
of $\rho\phi'(\tau) = 1$, one gets for any fixed k

$$X_k(z) \sim \left(1 - k\gamma\rho\phi''(\tau)\sqrt{1 - \frac{z}{\rho}}\right)\left(\tau - \gamma\sqrt{1 - \frac{z}{\rho}}\right) \sim \tau - \gamma\left(\tau\rho\phi''(\tau)\,k + 1\right)\sqrt{1 - \frac{z}{\rho}}.$$

Thus comparing the singular part of $X_k(z)$ to that of y(z), we find: For fixed k, the mean
number of nodes at level k in a tree is of the asymptotic form

$$\mathbb{E}_{\mathcal{V}_n}[\xi_k] \sim Ak + 1, \qquad A := \tau\rho\phi''(\tau).$$

This result was first given by Meir and Moon [356]. The striking fact is that, although the
number of nodes at level k can at least double at each level, growth is only linear on average.
In figurative terms, the immediate vicinity of the root starts like a “cone”, and trees of simple
varieties tend to be rather skinny near their base.
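
The linear growth Ak + 1 can be observed on exact data. As a minimal sketch (our own test case and names), take binary trees with $\phi(w) = (1+w)^2$, for which $\tau = 1$, $\rho = 1/4$, $\phi''(\tau) = 2$, hence A = 1/2, and $z\phi'(y) = 2z(1+y)$ is a polynomial in y. Formula (29) is then applied by k truncated series multiplications.

```python
def binary_level_profile_mean(n, k):
    """Exact mean number of nodes at level k over binary trees (phi(w) = (1+w)^2) of size n."""
    # Counts y[m] from y = z (1 + y)^2, i.e. y_m = 2 y_{m-1} + sum_{i+j=m-1} y_i y_j.
    y = [0] * (n + 1)
    y[1] = 1
    for m in range(2, n + 1):
        y[m] = 2 * y[m - 1] + sum(y[i] * y[m - 1 - i] for i in range(1, m - 1))
    # Coefficients of z * phi'(y(z)) = 2z + 2z*y(z).
    factor = [0, 2] + [2 * y[m - 1] for m in range(2, n + 1)]
    # X_0 = y, and X_k = (z phi'(y)) * X_{k-1}, as in (29).
    X = y[:]
    for _ in range(k):
        X = [sum(factor[i] * X[m - i] for i in range(m + 1)) for m in range(n + 1)]
    return X[n] / y[n]
```

For n = 200 the values at levels k = 1, 2, 3 are close to 1.5, 2, 2.5, slightly below the limits because of lower-order corrections.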

When used in conjunction with saddle point bounds, the exact GF expression of (29)
additionally provides a probabilistic upper bound on the height of trees of the form
$O(n^{1/2+\delta})$ for any $\delta > 0$. Indeed restrict z to the interval $(0, \rho)$ and assume that
$k = n^{1/2+\delta}$. Let $\chi$ be the height parameter. First, we have

$$\mathbb{P}_{\mathcal{V}_n}(\chi \ge k) \equiv \mathbb{E}_{\mathcal{V}_n}\left([\![\xi_k \ge 1]\!]\right) \le \mathbb{E}_{\mathcal{V}_n}(\xi_k). \tag{30}$$

Next by saddle point bounds, for any legal positive x ($0 < x < R_{\mathrm{conv}}(\phi)$),

$$\mathbb{E}_{\mathcal{V}_n}(\xi_k) \le \left(x\phi'(y(x))\right)^k y(x)\, x^{-n} \le \tau \left(x\phi'(y(x))\right)^k x^{-n}. \tag{31}$$

Fix now $x = \rho - \frac{n^{\delta}}{n}$. Local expansions then show that

$$\log\left(\left(x\phi'(y(x))\right)^k x^{-n}\right) \le -K n^{3\delta/2} + O(n^{\delta}), \tag{32}$$

for some positive constant K. Thus, by (30) and (32): In a smooth simple variety of trees,
the probability of height exceeding $n^{1/2+\delta}$ is exponentially small, being of the rough form
$\exp(-n^{3\delta/2})$. Accordingly, the mean height is $O(n^{1/2+\delta})$ for any $\delta > 0$. Flajolet and
Odlyzko [197] have characterized the moments of height, the mean being in particular
asymptotic to $\lambda\sqrt{n}$ and the limit distribution being of the Theta type already encountered in Chapter V
in the particular case of general Catalan trees, where explicit expressions are available. (Further
local limit and large deviation estimates appear in [180].) Figure 6 displays three random trees
of size n = 500. END OF EXAMPLE VII.7.


▷ VII.9. The variance of level profiles. The BGF of trees with u marking nodes at level k
has an explicit expression, in accordance with the developments of Chapter III. For instance
for k = 3, this is $z\phi(z\phi(z\phi(u\,y(z))))$. Double differentiation followed by singularity analysis
shows that 


Wve lea] ~ 5 A2a? 548 4A)k+7A-1, 
FIGURE VII.6. Three random 2-3 trees ($\Omega = \{0, 2, 3\}$) of size n = 500 have height
respectively 48, 57, 47, in agreement with the fact that height is typically $O(\sqrt{n})$.


another result of Meir and Moon [356]. The precise analysis of the mean and variance in the
interesting regime where k is of order $\sqrt{n}$ is also given in [356], but it requires the saddle point method
of Chapter VIII or the methods of Chapter IX. ◁


EXAMPLE VII.8. Mean degree profile. Let $\xi(t) \equiv \xi_k(t)$ be the number of nodes of degree k
in a random tree of some variety $\mathcal{V}$. The analysis extends that of the root degree seen earlier. The
parameter $\xi$ is an additive functional induced by the basic parameter $\eta(t) \equiv \eta_k(t)$ defined by
$\eta_k(t) := [\![\deg(t) = k]\!]$. By the analysis of root degree, we have for the GF of cumulated values
associated to $\eta$

$$H(z) = \phi_k\, z\, y(z)^k, \qquad \phi_k := [w^k]\phi(w),$$

so that, by the fundamental formula (28),

$$X(z) = \phi_k\, z\, y(z)^k \cdot \frac{z y'(z)}{y(z)} = z^2 \phi_k\, y(z)^{k-1} y'(z).$$

The singular expansion of $zy'(z)$ results from that of y(z) by differentiation (Chapter VI),

$$z y'(z) = \frac{\gamma}{2}\,\frac{1}{\sqrt{1 - z/\rho}} + O(1),$$

and the corresponding coefficient is $[z^n](zy') = n y_n$. This gives immediately the singularity
type of X, which is of the form of an inverse square root. Thus,

$$X(z) \sim \rho\,\phi_k \tau^{k-1}\left(z y'(z)\right),$$

implying ($\rho = \tau/\phi(\tau)$)

$$\frac{X_n}{n y_n} \sim \frac{\phi_k \tau^k}{\phi(\tau)}.$$


Consequently, one has: 
Proposition VII.2. In a smooth simple variety of trees, the mean number of nodes of degree k
is asymptotic to $\lambda_k n$, where $\lambda_k := \phi_k \tau^k/\phi(\tau)$. Equivalently, the probability distribution of the
degree $\Delta'$ of a random node in a random tree of size n satisfies

$$\lim_{n\to\infty} \mathbb{P}_n(\Delta' = k) = \lambda_k \equiv \frac{\phi_k \tau^k}{\phi(\tau)}, \qquad\text{with PGF:}\quad \sum_k \lambda_k u^k = \frac{\phi(\tau u)}{\phi(\tau)}.$$

Tree             φ(w)            τ, ρ           PGF                      (type)

binary           (1 + w)^2       1, 1/4         1/4 + u/2 + u^2/4        (Bernoulli)
unary-binary     1 + w + w^2     1, 1/3         1/3 + u/3 + u^2/3        (Bernoulli)
general          (1 − w)^{-1}    1/2, 1/4       1/(2 − u)                (Geometric)
Cayley           e^w             1, e^{-1}      e^{u−1}                  (Poisson)

For instance, asymptotically, a general Catalan tree has on average n/2 leaves, n/4 nodes of
degree 1, n/8 of degree 2, and so on; a Cayley tree has $\sim n e^{-1}/k!$ nodes of degree k; for
binary (Catalan) trees, the four possible types of nodes each appear with asymptotic
frequency 1/4. (These data agree with the fact that a random tree under $\mathcal{V}_n$ is distributed like a
branching process tree determined by the PGF $\phi(u\tau)/\phi(\tau)$; see Subsection III. 6.2, p. 182.)

END OF EXAMPLE VII.8.
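
Proposition VII.2 can be tested against exact counts. The sketch below (an illustration with our own naming) again uses general Catalan trees, where $\lambda_k = 2^{-(k+1)}$. It relies on the form $X(z) = z^2\phi_k y^{k-1}y' = (z^2/k)\,\frac{d}{dz}y^k$, valid for $k \ge 1$, so that $X_n = \frac{n-1}{k}[z^{n-1}]y^k$ when $\phi_k = 1$.

```python
from math import comb

def catalan_mean_degree_count(n, k):
    """Exact mean number of nodes of degree k >= 1 in a uniform general Catalan tree of size n."""
    # y[m] = number of general plane trees with m nodes, for m = 0 .. n.
    y = [0] + [comb(2 * m - 2, m - 1) // m for m in range(1, n + 1)]
    # power = y(z)^k truncated after z^n, by repeated convolution.
    power = [1] + [0] * n
    for _ in range(k):
        power = [sum(power[i] * y[m - i] for i in range(m + 1)) for m in range(n + 1)]
    # Cumulated value X_n = ((n - 1) / k) * [z^(n-1)] y^k, divided by the number of trees.
    return (n - 1) * power[n - 1] / (k * y[n])
```

Dividing the result by n gives the empirical frequency of degree-k nodes, to be compared with $2^{-(k+1)}$.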


▷ VII.10. Variances. The variance of the number of k-ary nodes is $\sim \nu n$, so that the
distribution of the number of nodes of this type is concentrated, for each fixed k. The starting point is
the BGF defined implicitly by

$$Y(z,u) = z\left(\phi(Y(z,u)) + \phi_k (u - 1)\, Y(z,u)^k\right),$$

upon taking a double derivative with respect to u, setting u = 1, and finally performing
singularity analysis on the resulting GF of cumulated values. ◁


▷ VII.11. The mother of a random node. The discrepancy in distributions between the root
degree and the degree of a random node deserves an explanation. Pick up a node distinct from
the root at random in a tree and look at the degree of its mother. The PGF of the law is in
the limit $u\phi'(u\tau)/\phi'(\tau)$. Thus the degree of the root is asymptotically the same as that of the
mother of any non-root node.

More generally, let X have distribution $p_k := \mathbb{P}(X = k)$. Construct a random variable Y
such that the probability $q_k := \mathbb{P}(Y = k)$ is proportional both to k and $p_k$. Then for the
associated PGFs, the relation $q(u) = u\,p'(u)/p'(1)$ holds. The law of Y is said to be the
size-biased version of the law of X. Here, a mother is picked up with an importance proportional to
its degree. In this perspective, Eve appears to be just like a random mother. ◁


EXAMPLE VII.9. Path length. Path length of a tree is the sum of the distances of all nodes to
the root. It is defined recursively by

$$\xi(t) = |t| - 1 + \sum_{j=1}^{\deg(t)} \xi(t_j).$$

In this case, we have $\eta(t) = |t| - 1$ corresponding to the GF of cumulated values
$H(z) = zy'(z) - y(z)$, and the fundamental relation (28) gives

$$X(z) = \frac{z y'(z)}{y(z)}\left(z y'(z) - y(z)\right) = \frac{z^2 y'(z)^2}{y(z)} - z y'(z).$$

The type of $y'(z)$ at its singularity is $Z^{-1/2}$, where $Z := (1 - z/\rho)$. The formula for X(z)
involves the square of $y'$, so that the singularity of X(z) is of type $Z^{-1}$, resembling a simple
pole. This means that the cumulated value $X_n = [z^n]X(z)$ grows like $\rho^{-n}$, so that the mean
value of $\xi$ over $\mathcal{V}_n$ has growth $n^{3/2}$. Working out the constants, we find

$$X(z) + z y'(z) \sim \frac{\gamma^2}{4\tau}\, Z^{-1} + O(Z^{-1/2}).$$


As a consequence:


Proposition VII.3. In a random tree of size n from a smooth simple variety, the expectation of
path length satisfies

$$\mathbb{E}_{\mathcal{V}_n}(\xi) = \lambda\sqrt{\pi n^3} + O(n), \qquad \lambda := \sqrt{\frac{\phi(\tau)}{2\tau^2\phi''(\tau)}}. \tag{33}$$

For our classical varieties, the main terms of (33) are then:

Binary              Unary-binary             General                  Cayley
\sqrt{\pi n^3}      (1/2)\sqrt{3\pi n^3}     (1/2)\sqrt{\pi n^3}      \sqrt{\pi n^3 / 2}

Observe that the quantity $\frac{1}{n}\mathbb{E}_{\mathcal{V}_n}(\xi)$ represents the expected depth of a random node in a random
tree (the model is then $[1\,..\,n] \times \mathcal{V}_n$), which is thus $\sim \lambda\sqrt{\pi n}$. (This result is consistent with
height of a tree being with high probability of order $O(n^{1/2})$.) END OF EXAMPLE VII.9.
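
The estimate (33) can be confronted with exact values obtained from the linear relation of Lemma VII.1. In this sketch (our own test case) the variety is that of binary trees, $\phi(w) = (1+w)^2$, for which $\lambda = 1$ and $z\phi'(y) = 2z(1+y)$; extracting coefficients in $X = H + z\phi'(y)X$ with $H_n = (n-1)y_n$ gives a quadratic-time recurrence.

```python
import math

def binary_path_length_tables(N):
    """Return (y, X): tree counts and cumulated path lengths for binary trees up to size N."""
    y = [0] * (N + 1)
    X = [0] * (N + 1)
    y[1] = 1
    for n in range(2, N + 1):
        # y = z (1 + y)^2
        y[n] = 2 * y[n - 1] + sum(y[i] * y[n - 1 - i] for i in range(1, n - 1))
        # X = H + 2 z (1 + y) X, with H_n = (n - 1) y_n
        X[n] = ((n - 1) * y[n] + 2 * X[n - 1]
                + 2 * sum(y[i] * X[n - 1 - i] for i in range(1, n - 1)))
    return y, X

def path_length_ratio(n):
    """Exact mean path length divided by the main term sqrt(pi n^3) of (33), lambda = 1."""
    y, X = binary_path_length_tables(n)
    return (X[n] / y[n]) / math.sqrt(math.pi * n ** 3)
```

The ratio tends to 1 only slowly, since the O(n) correction in (33) is of relative order $n^{-1/2}$; at n = 400 it is still noticeably below 1.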


▷ VII.12. Variance of path length. Path length can be analysed starting from the bivariate
generating function given by a functional equation of the difference type (see Chapter III, p. 174),
which allows for the computation of higher moments. The standard deviation is found to be
asymptotic to $\Lambda_2 n^{3/2}$ for some computable constant $\Lambda_2 > 0$, so that the distribution is spread.
(Louchard [339] and Takács [460] have additionally worked out the asymptotic form of all
moments, leading to a characterization of the limit law of path length that can be described in terms
of the Airy function and coincides with the Brownian excursion area.) ◁


▷ VII.13. Generalizations of path length. Define the subtree size index of order $\alpha \in \mathbb{R}_{\ge 0}$ to
be $\xi(t) \equiv \xi_\alpha(t) := \sum_{s \preceq t} |s|^{\alpha}$, where the sum is extended to all the subtrees s of t. This
corresponds to a recursively defined parameter with $\eta(t) = |t|^{\alpha}$. The results of Section VI. 10
relative to Hadamard products and polylogarithms make it possible to analyse the singularities
of H(z) and X(z). It is found that there are three different regimes

α > 1/2:   E_{V_n}(ξ) ∼ K_α n^{α+1/2}
α = 1/2:   E_{V_n}(ξ) ∼ K_{1/2} n log n
α < 1/2:   E_{V_n}(ξ) ∼ K_α n

where each $K_\alpha$ is a computable constant. (This extends the results of Subsection VI. 10.3,
p. 409 to all simple varieties of trees that are smooth.) ◁


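VII. 3.3. Mappings. The basic construction of mappings,

$$\begin{cases} \mathcal{F} = \mathrm{SET}(\mathcal{K}) \\ \mathcal{K} = \mathrm{CYC}(\mathcal{T}) \\ \mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T}) \end{cases} \qquad\Longrightarrow\qquad \begin{cases} F = \exp(K) \\ K = \log\dfrac{1}{1 - T} \\ T = z e^{T}, \end{cases}$$

builds maps from Cayley trees, which constitute a smooth simple variety. The
construction lends itself to a number of multivariate extensions. For instance, the
parameter $\chi(\varphi)$ equal to the number of cyclic points gives rise to the BGF

$$F(z,u) = \exp\left(\log\frac{1}{1 - uT}\right) = (1 - uT)^{-1}.$$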
# components       ∼ (1/2) log n          tail length (λ)    ∼ \sqrt{\pi n/8}
# cyclic nodes     ∼ \sqrt{\pi n/2}       cycle length (μ)   ∼ \sqrt{\pi n/8}
# terminal nodes   ∼ n e^{-1}             tree size          ∼ n/3
                                          component size     ∼ 2n/3

FIGURE VII.7. Expectations of the main additive parameters of random mappings of size n.


The mean number of cyclic points in a random mapping of size n is accordingly

$$\mathbb{E}_{\mathcal{F}_n}(\chi) = \frac{n!}{n^n}\,[z^n]\left(\left.\frac{\partial}{\partial u} F(z,u)\right|_{u=1}\right) = \frac{n!}{n^n}\,[z^n]\,\frac{T}{(1 - T)^2}.$$

Singularity analysis is immediate as

$$\frac{T}{(1 - T)^2} \underset{z \to e^{-1}}{\sim} \frac{1}{2}\,\frac{1}{1 - ez} \qquad\Longrightarrow\qquad [z^n]\,\frac{T}{(1 - T)^2} \underset{n\to\infty}{\sim} \frac{1}{2}\,e^n.$$

The mean number of cyclic points in a random n-mapping is asymptotic to $\sqrt{\pi n/2}$.
A large number of parameters can be analysed in this way systematically as shown in
the survey [198]: see Figure 7 for a summary of results whose proof we leave as an
exercise to the reader. The leftmost table describes global parameters of mappings;
the rightmost table is relative to properties of a random point in a random n-mapping:
$\lambda$ is the distance to its cycle of a random point, $\mu$ the length of the cycle to which the
point leads, tree size and component size are respectively the size of the largest tree
containing the point and the size of its (weakly) connected component. In particular, a
random mapping of size n has relatively few components, some of which are expected
to be of a fairly large size.
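
The estimate $\sqrt{\pi n/2}$ for cyclic points can be checked at small and large sizes. The sketch below is our own illustration. It enumerates all mappings for tiny n, and for larger n it uses the closed form $\sum_{k=1}^{n} n(n-1)\cdots(n-k+1)/n^k$; that formula is not stated in the text and is taken here as an assumption, justified by the fact that a given point lies on a cycle of length k with probability $(n-1)\cdots(n-k+1)/n^k$.

```python
import math
from fractions import Fraction
from itertools import product

def mean_cyclic_points_brute_force(n):
    """Average number of cyclic points over all n^n mappings of {0, ..., n-1}."""
    total = 0
    for f in product(range(n), repeat=n):
        for x in range(n):
            y = x
            for _ in range(n):          # x is cyclic iff some iterate of f returns to x
                y = f[y]
                if y == x:
                    total += 1
                    break
    return Fraction(total, n ** n)

def mean_cyclic_points_formula(n):
    """Floating-point value of sum_{k=1..n} n(n-1)...(n-k+1) / n^k (assumed closed form)."""
    total, term = 0.0, 1.0
    for k in range(1, n + 1):
        term *= (n - k + 1) / n
        total += term
    return total
```

The brute-force averages agree with the closed form for n up to 5, and for n = 10000 the closed form is within one percent of $\sqrt{\pi n/2}$.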

The properties outlined above for the class of all mappings also prove to be universal
for a wide variety of mappings defined by degree restrictions of various sorts.


EXAMPLE VII.10. Simple varieties of mappings. Let $\Omega$ be a subset of the integers and
consider mappings $\varphi \in \mathcal{F}$ such that the number of preimages of any point is constrained to lie
in $\Omega$. Such special mappings may serve to model the behaviour of special classes of functions
under iteration, and are accordingly of interest in various areas of computational number theory
and cryptography. For instance the quadratic functions $\varphi(x) = x^2 + a$ over $\mathbb{F}_p$ have the property
that each element y has either zero, one, or two preimages (depending on whether y − a is a
quadratic nonresidue, 0, or a quadratic residue).
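
The claim about preimages of quadratic maps is easy to confirm by direct enumeration. A minimal sketch, with a hypothetical helper name, for an odd prime p:

```python
from collections import Counter

def preimage_profile(p, a):
    """Map each preimage count c to the number of y in F_p with exactly c preimages under x -> x^2 + a."""
    hits = Counter((x * x + a) % p for x in range(p))
    return dict(Counter(hits.get(y, 0) for y in range(p)))
```

For an odd prime p, exactly one element (namely y = a) has a single preimage, while (p − 1)/2 elements have two and (p − 1)/2 have none, so that here Ω = {0, 1, 2}.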

The basic construction of mappings needs to be amended. Start with the family of trees $\mathcal{T}$
that are the simple variety corresponding to $\Omega$:

$$T = z\phi(T), \qquad \phi(w) := \sum_{\omega\in\Omega} \frac{w^{\omega}}{\omega!}.$$

At any vertex on a cycle, one must graft r trees with the constraint that $r + 1 \in \Omega$ (since one
edge is coming from the cycle itself). Such legal tuples with a root appended are represented by

$$U = z\phi'(T),$$

since $\phi$ is an exponential generating function and shift corresponds to differentiation. Then
connected components and mappings are formed in the usual way by

$$K = \log\frac{1}{1 - U}, \qquad F = \exp(K) = \frac{1}{1 - U}.$$
We assume that $\phi$ (i.e., $\Omega$) satisfies the general conditions of Theorem VII.2, with $\tau$ the
characteristic value. Then T(z) has a square-root singularity at $\rho = \tau/\phi(\tau)$. The same holds for U
which satisfies the singular expansion

$$U(z) \sim 1 - \rho\gamma\phi''(\tau)\sqrt{1 - \frac{z}{\rho}}, \tag{35}$$

since $U = z\phi'(T)$ and $\rho\phi'(\tau) = 1$. Thus, eventually,

$$F(z) = \frac{1}{1 - U(z)} \sim \frac{\lambda}{\sqrt{1 - z/\rho}}, \qquad \lambda := \frac{1}{\rho\gamma\phi''(\tau)}.$$

There results the universality of an $n^{-1/2}$ law in such constrained mappings,

$$\frac{1}{n!}\,F_n \sim \frac{\lambda}{\sqrt{\pi n}}\,\rho^{-n},$$

which nicely extends what is known to hold for unrestricted mappings. The analysis of additive
functionals can then proceed on lines very similar to the case of standard mappings, to the
effect that the estimates of Figure 7 hold, albeit with different multiplicative constants. The
programme just sketched has been carried out in a thorough way by Arney and Bender in [14]
to which we refer for a detailed treatment. END OF EXAMPLE VII.10.


▷ VII.14. Probabilities of first-order sentences. A beautiful theorem of Lynch [348], much in
line with the global aims of analytic combinatorics, gives a class of properties of random
mappings for which asymptotic probabilities are systematically computable. In logics, a first-order
sentence is built out of variables, equality, boolean connectives ($\vee, \wedge, \neg$, etc), and quantifiers
($\forall, \exists$). In addition, there is a function symbol $\varphi$, representing a generic mapping.


Theorem. Given a property P expressed by a first-order sentence, let $\mu_n(P)$ be the
probability that P is satisfied by a random mapping $\varphi$ of size n. Then the quantity
$\mu_\infty(P) = \lim_{n\to\infty} \mu_n(P)$ exists and its value is given by an expression consisting
of integer constants and the operators $+, -, \times, \div$, and $e^x$.


For instance:

P:          φ is perm.         φ without fixed pt.     φ has #leaves ≥ 2
sentence:   ∀x ∃y φ(y) = x     ∀x ¬ φ(x) = x           ∃x,y [x ≠ y ∧ ∀z [φ(z) ≠ x ∧ φ(z) ≠ y]]
μ_∞(P):     0                  e^{-1}                  1

One can express in this language a property like $P_{12}$: “all cycles of length 1 are attached to


—1 

trees of height at most 2”, for which the limit probability is e~'T° “te” The proof of the theo- 
rem is based on Ehrenfeucht games supplemented by ingenious inclusion-exclusion arguments. 
(Most examples, like $P_{12}$, can be directly treated by singularity analysis.) Compton [94, 95, 96]
has produced lucid surveys of this area of logics, known as finite model theory. ◁


VII. 4. Tree-like structures and implicit functions 


The goal of this section is to show that universality of the square-root singularity
type holds for classes of recursively defined structures, which considerably extend
the case of (smooth) simple varieties of trees. The starting point is the investigation of
labelled recursive classes $\mathcal{Y}$, with associated GF y(z), that are given by a construction,

$$\mathcal{Y} = \mathfrak{G}[\mathcal{Z}, \mathcal{Y}] \qquad\Longrightarrow\qquad y(z) = G(z, y(z)), \tag{36}$$

where $\mathfrak{G}$ may be an arbitrary composition of basic constructors reflected by a bivariate
function G(z, w) in the labelled case. This situation covers for instance hierarchies
(Chapter II), Schröder’s generalized systems (Chapter I), paths with diagonal
steps, as well as trees with variable node sizes or edge lengths. The unlabelled universe
also benefits from the technology developed for functions implicitly defined by
y = G(z, y). This technology then makes it possible to estimate counting sequences
and parameters of many recursive structures, when Pólya operators are involved
(Section VII. 5, p. 453).


VII. 4.1. The smooth implicit-function schema. The investigation of (36) necessitates
certain analytic conditions to be satisfied by the bivariate function G, which
we first encapsulate into the definition of a schema.


Definition VII.4. Let y(z) be a function analytic at 0, $y(z) = \sum_{n\ge 0} y_n z^n$, with
$y_0 = 0$ and $y_n \ge 0$. The function is said to belong to the smooth implicit-function
schema if there exists a bivariate G(z, w) such that

$$y(z) = G(z, y(z)),$$

where G(z, w) satisfies the following conditions.
— (I1): $G(z,w) = \sum_{m,n\ge 0} g_{m,n} z^m w^n$ is analytic in a domain $|z| < R$ and
$|w| < S$, for some R, S > 0.
— (I2): The coefficients of G satisfy

$$g_{m,n} \ge 0, \qquad g_{0,0} = 0, \qquad g_{0,1} \neq 1, \qquad g_{m,n} > 0 \ \text{for some } m \text{ and for some } n \ge 2. \tag{37}$$

— (I3): There exist two numbers r, s, such that 0 < r < R and 0 < s < S, satisfying
the system of equations,

$$G(r,s) = s, \qquad G_w(r,s) = 1, \qquad\text{with}\quad r < R,\ s < S, \tag{38}$$

which is called the characteristic system. A class $\mathcal{Y}$ with such a generating function
y(z) is also said to belong to the smooth implicit-function schema.


Postulating that G(z, w) is analytic and with nonnegative coefficients is a minimal
assumption in the context of analytic combinatorics. The problem is assumed to be
normalized, so that y(0) = 0 and G(0, 0) = 0, the condition $g_{0,1} \neq 1$ being imposed
to avoid that the implicit equation be of the reducible form $y = y + \cdots$ (first line
of (37)). The second condition of (37) means that in G(z, y), the dependency on y
is nonlinear (otherwise, the analysis resorts to rational and meromorphic asymptotic
methods of Chapter V). The major analytic condition is (I3), which postulates the
existence of positive solutions r, s to the characteristic system within the domain of
analyticity of G.

The main result⁶ due to Meir and Moon [360] expresses universality of the square-
root singularity together with its usual consequences regarding asymptotic counting.


⁶ This theorem has an interesting history. An overly general version of it was first stated by Bender in
1974 (Theorem 5 of [29]). Canfield [77] pointed out ten years later that Bender’s conditions were not quite
sufficient to grant square-root singularity. A corrected statement was given by Meir and Moon in [360]
with a further (minor) erratum in [359]. We follow here the form given in Theorem 10.13 of Odlyzko’s
survey [377] with the correction of another minor misprint (regarding $g_{0,1}$ which should read $g_{0,1} \neq$
Theorem VII.3 (Smooth implicit-function schema). Let y(z) belong to the smooth
implicit-function schema defined by G(z, w), with (r, s) the positive solution of the
characteristic system. Then, y(z) converges at z = r where it has a square-root
singularity,

$$y(z) \underset{z\to r}{=} s - \gamma\sqrt{1 - z/r} + O(1 - z/r), \qquad \gamma := \sqrt{\frac{2r\,G_z(r,s)}{G_{ww}(r,s)}},$$

the expansion being valid in a $\Delta$-domain. If, in addition, y(z) is aperiodic⁷, then r is
the unique dominant singularity of y and the coefficients satisfy

$$[z^n]\,y(z) \underset{n\to\infty}{=} \frac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n}\left(1 + O(n^{-1})\right).$$

Observe that the assumptions imply the existence of exactly one root of the
characteristic system within the part of the positive quadrant where G is analytic, since,
obviously, $y_n$ cannot admit two asymptotic expressions with different parameters. A
complete expansion exists in powers of $(1 - z/r)^{1/2}$ (for y(z)) and in powers of 1/n
(for $y_n$), while periodic cases can be treated by a simple extension of the technical
apparatus to be developed.
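
Theorem VII.3 can be tried out on labelled hierarchies, one of the structures listed at the start of this section. The sketch below rests on an assumption made for illustration, namely the standard equation $y = z + e^y - 1 - y$, so that $G(z,w) = z + e^w - 1 - w$. The characteristic system then gives $e^s - 1 = 1$, hence $s = \log 2$ and $r = 2\log 2 - 1$, with $G_z = 1$ and $G_{ww} = e^s = 2$. Coefficients are computed in floating point from $E' = y'E$ for $E = e^y$.

```python
import math

def hierarchy_coefficients(N):
    """h[n] = [z^n] y(z) for y = z + exp(y) - 1 - y, computed in floating point."""
    h = [0.0] * (N + 1)   # coefficients of y
    e = [0.0] * (N + 1)   # coefficients of E = exp(y)
    e[0] = 1.0
    for n in range(1, N + 1):
        # E' = y' E gives n e_n = n h_n + s, where s collects the already known terms;
        # the functional equation gives 2 h_n = [n == 1] + e_n for n >= 1.
        s = sum(k * h[k] * e[n - k] for k in range(1, n))
        h[n] = (1.0 if n == 1 else 0.0) + s / n
        e[n] = h[n] + s / n
    return h

def hierarchy_estimate(n):
    """Main term gamma / (2 sqrt(pi n^3)) * r^(-n) of Theorem VII.3 for this G."""
    s = math.log(2.0)
    r = 2 * s - 1
    gamma = math.sqrt(2 * r * 1.0 / math.exp(s))   # G_z(r, s) = 1, G_ww(r, s) = e^s
    return gamma / (2 * math.sqrt(math.pi * n ** 3)) * r ** (-n)
```

Multiplying h[n] by n! recovers the integer counts 1, 1, 4, 26, 236, 2752 for n = 1, ..., 6, and the ratio of h[n] to the estimate approaches 1 as n grows.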

The proof of this theorem first necessitates two statements of independent interest:
(i) Lemma VII.2 is logically equivalent to an analytic version of the classical Implicit
Function Theorem found in APPENDIX B: Implicit Function Theorem, p. 698.
(ii) Lemma VII.3 supplements this by describing what happens at a point where the
implicit function theorem “fails”. (These two statements extend the analytic and the
singular inversion lemma of Subsection IV. 7.1, p. 261.)

Lemma VII.2 (Analytic Implicit Functions). Let F(z, w) be a bivariate function
analytic at $(z, w) = (z_0, w_0)$. Assume that $F(z_0, w_0) = 0$ and $F_w(z_0, w_0) \neq 0$.
Then, there exists a unique function y(z) analytic in a neighbourhood of $z_0$ such that
$y(z_0) = w_0$ and $F(z, y(z)) = 0$.

PROOF. This is a restatement of the Analytic Implicit Function Theorem of APPENDIX B:
Implicit Function Theorem, p. 698, upon effecting a translation $z \mapsto z + z_0$,
$w \mapsto w + w_0$. (This property extends the Analytic Inversion Lemma IV.2, p. 262.)


Lemma VII.3 (Singular Implicit Functions). Let F(z, w) be a bivariate function
analytic at $(z, w) = (z_0, w_0)$. Assume the conditions: $F(z_0, w_0) = 0$, $F_z(z_0, w_0) \neq 0$,
$F_w(z_0, w_0) = 0$, and $F_{ww}(z_0, w_0) \neq 0$. Choose an arbitrary ray of angle $\theta$ emanating
from $z_0$. Then there exists a neighbourhood $\Omega$ of $z_0$ such that at every point z of $\Omega$
with $z \neq z_0$ and z not on the ray, the equation F(z, y) = 0 admits two solutions $y_1(z)$
and $y_2(z)$ that are analytic in $\Omega$ slit along the ray and satisfy, as $z \to z_0$,

$$y_1(z) = w_0 - \gamma\sqrt{1 - z/z_0} + O(1 - z/z_0), \qquad \gamma := \sqrt{\frac{2 z_0\, F_z(z_0, w_0)}{F_{ww}(z_0, w_0)}},$$


1). A statement concerning a restricted class of functions (either polynomial or entire) already appears in
Hille’s book [268, p. 274].

⁷ In the usual sense: $f(z) = \sum_n f_n z^n$ is aperiodic if there exist three indices i < j < k such that
$f_i f_j f_k \neq 0$ and $\gcd(j - i, k - i) = 1$.

FIGURE VII.8. The connection problem for the equation $w = \frac{1}{4}z + w^2$ (with explicit
forms $w = \frac{1}{2}\left(1 \pm \sqrt{1 - z}\right)$): the combinatorial solution y(z) near z = 0 and the two analytic
solutions $y_1(z)$, $y_2(z)$ near z = 1.


and similarly for $y_2$ whose expansion is obtained by changing $\sqrt{\ }$ to $-\sqrt{\ }$.
PROOF. Locally, near (r,s), the function F(z, w) behaves like 

1 
(39) F+(w-s)Fyt+(z-—r)F,+ gw — 8)" Fuow, 


(plus smaller order terms), where F and its derivatives are evaluated at the point (r, s). 
Since F = F,, = 0, cancelling (39) suggests for the solutions of F(z, w) = 0 near 
z =r the form 


w-s=+ 7Vr—-—2+O0(z-71), 

which is consistent with the statement. This informal argument can be justified by the 
following steps (details omitted): (a) establish the existence of a formal solution in 
powers of $\pm(1 - z/r)^{1/2}$; (b) prove, by the method of majorant series, that the formal
solutions also converge locally and provide a solution to the equation. 

Alternatively, by the Weierstrass Preparation Theorem (see again APPENDIX B: 
Implicit Function Theorem, p. 698) the two solutions $y_1(z)$, $y_2(z)$ that assume the
value $s$ at $z = r$ are solutions of a quadratic equation

$$(Y - s)^2 + b(z)(Y - s) + c(z) = 0,$$

where $b$ and $c$ are analytic at $z = r$, with $b(r) = c(r) = 0$. The solutions are then
obtained by the usual formula for solving a quadratic equation,

$$Y - s = \frac{1}{2}\Bigl(-b(z) \pm \sqrt{b(z)^2 - 4c(z)}\Bigr),$$

which provides for $y_j(z)$ an expression as the square-root of an analytic function and
yields the statement. (This property extends the Singular Inversion Theorem VI.6,
p. 387.)
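
As a small worked illustration (not part of the original proof), the singular lemma can be checked by hand on the quadratic equation of Figure VII.8. The sketch below takes $F(z,w) = w - \frac{1}{4}z - w^2$ and $(z_0,w_0) = (1,\frac{1}{2})$, and assumes that the constant of Lemma VII.3 is $\gamma = \sqrt{2 z_0 F_z / F_{ww}}$ evaluated at $(z_0,w_0)$.

```latex
\begin{align*}
F(1,\tfrac12) &= \tfrac12 - \tfrac14 - \tfrac14 = 0, &
F_w(z,w) &= 1 - 2w = 0 \ \text{ at } w = \tfrac12,\\
F_z &= -\tfrac14 \neq 0, &
F_{ww} &= -2 \neq 0,\\
\gamma &= \sqrt{\frac{2\cdot 1\cdot(-\tfrac14)}{-2}} = \tfrac12, &
y_{1,2}(z) &= \tfrac12 \mp \tfrac12\sqrt{1 - z}.
\end{align*}
```

In this particular case the two-term expansion is exact, since the explicit solutions are $w = \frac{1}{2}(1 \pm \sqrt{1 - z})$; in general only the square-root term is captured, with a remainder of order $1 - z/z_0$.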
It is now possible to return to the proof of our main statement.


PROOF. [Theorem VII.3] Given the two lemmas, the general idea of the proof of Theorem VII.3
can be easily grasped. Set $F(z,w) = w - G(z,w)$. There exists a unique
analytic function $y(z)$ satisfying $y = G(z,y)$ near $z = 0$, by the analytic lemma.
On the other hand, by the singular lemma, near the point $(z,w) = (r,s)$, there exist
two solutions $y_1$, $y_2$, both of which have a square-root singularity. Given the positive
character of the coefficients of $G$, it is not hard to see that, of $y_1$, $y_2$, the function $y_1(z)$
is increasing as $z$ approaches $r$ from the left (assuming the principal determination of
the square root in the definition of $\gamma$). A simple picture of the situation regarding the
solutions to the equation $y = G(z,y)$ is exemplified by Figure 8.

The problem is then to show that a smooth analytic curve (the dotted curve in
Figure 8) does connect the solution at 0 to the increasing branch solution at $r$. Precisely,
one needs to check that $y_1(z)$ (defined near $r$) is the analytic continuation of
$y(z)$ (defined near 0) as $z$ increases along the positive real axis. This is indeed a delicate
connection problem whose technical proof is discussed in Note 15. Once this fact
is granted and it has been verified that $r$ is the unique dominant singularity of $y(z)$
(Note 16), the statement of Theorem VII.3 follows directly by singularity analysis.


▷ VII.15. The connection problem for implicit functions. A proof that $y(z)$ and $y_1(z)$ are well
connected is given by Meir and Moon in [360], from which our description is adapted.

Let $\rho$ be the radius of convergence of $y(z)$ at 0 and $\tau = y(\rho)$. The point $\rho$ is a singularity
of $y(z)$ by Pringsheim's Theorem. The goal is to establish that $\rho = r$ and $\tau = s$. Regarding
the curve

$$\mathcal{C} = \{(z, y(z)) \mid 0 \leq z \leq \rho\},$$

this means that three cases are to be excluded:
(a) $\mathcal{C}$ stays entirely in the interior of the rectangle $R := \{(z,y) \mid 0 \leq z \leq r,\ 0 \leq y \leq s\}$.
(b) $\mathcal{C}$ intersects the upper side of the rectangle $R$ at some point of abscissa $r_0 < r$ where $y(r_0) = s$.
(c) $\mathcal{C}$ intersects the rightmost side of the rectangle $R$ at the point $(r, y(r))$ with $y(r) < s$.

Graphically, the three cases are depicted in Figure 9.

FIGURE VII.9. The three cases (a), (b), and (c), to be excluded (solid lines).

In the discussion, we make use of the fact that $G(z,w)$, which has nonnegative coefficients,
is an increasing function in each of its arguments. Also, the form

$$y' = \frac{G_z(z,y)}{1 - G_w(z,y)} \tag{40}$$

shows differentiability (hence analyticity) of the solution $y$ as soon as $G_w(z,y) \neq 1$.

—Case (a) is excluded. Assume that $0 < \rho < r$ and $0 < \tau < s$. Then, we have
$G_w(r,s) = 1$, and by monotonicity properties of $G_w$, the inequality $G_w(\rho,\tau) < 1$ holds. But
then $y(z)$ must be analytic at $z = \rho$, which contradicts the fact that $\rho$ is a singularity.

—Case (b) is excluded. Assume that $0 < r_0 < r$ and $y(r_0) = s$. Then there are two
distinct points on the implicit curve $y = G(z,y)$ at the same altitude, namely $(r_0,s)$ and $(r,s)$,
implying the equalities

$$y(r_0) = G(r_0, y(r_0)) = s = G(r,s),$$

which contradicts the monotonicity properties of $G$.

—Case (c) is excluded. Assume that $y(r) < s$. Let $a < r$ be a point chosen close
enough to $r$. Then above $a$, there are three branches of the curve $y = G(z,y)$, namely
$y(a)$, $y_1(a)$, $y_2(a)$, where the existence of $y_1$, $y_2$ results from Lemma VII.3. This means that
the function $y \mapsto G(a,y)$ has a graph that intersects the main diagonal at three points, a
contradiction with the fact that $G(a,y)$ is a convex function of $y$. ◁


▷ VII.16. Unicity of the dominant singularity. From the previous note, we know that $y(r) = s$,
with $r$ the radius of convergence of $y$. The aperiodicity of $y$ implies that $|y(\zeta)| < y(r)$ for all
$\zeta$ such that $|\zeta| = r$ and $\zeta \neq r$ (see the Daffodil Lemma IV.1, p. 253). One then has for any
such $\zeta$ the property: $|G_w(\zeta, y(\zeta))| < G_w(r,s) = 1$, by monotonicity of $G_w$. But then by (40)
above, this implies that $y(z)$ is analytic at $\zeta$. ◁

The solutions to the characteristic system (38) can be regarded as the intersection 
points of two curves, namely, 


$$G(r,s) - s = 0, \qquad G_w(r,s) = 1.$$


Here are plots in the case of two functions G: the first one has nonnegative coefficients 
while the second one (corresponding to a counterexample of Canfield [77]) involves 
negative coefficients. Positivity of coefficients implies convexity properties that avoid 
pathological situations. 


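[Figure: the two curves of the characteristic system, plotted for a function $G(z,y)$ with nonnegative coefficients (panel “positive”) and for a function $G(z,y)$ involving negative coefficients (panel “not positive”).]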


VII. 4.2. Combinatorial applications. Many combinatorial classes, which admit
a recursive specification of the form $\mathcal{Y} = \mathfrak{G}(\mathcal{Z}, \mathcal{Y})$, with $\mathfrak{G}$ a constructor of sorts,
can be subjected to Theorem VII.3. The resulting structures are, to varying degrees,
avatars of tree structures. In what follows, we describe a few instances where the
square-root universality holds.
[Figure: a classification tree whose leaves are, in order: German, English, Dutch, Swedish, Danish, Portuguese, Spanish, Italian, French, Romanian, Armenian.]

FIGURE VII.10. A hierarchy placed on some of the modern Indo-European languages.


— Hierarchies are trees enumerated by the number of their leaves (Exam- 
ples 11 and 12). 

— Trees with variable node sizes generalize simple families of trees; they oc- 
cur in particular as models of secondary structures in mathematical biology 
(Example 13). 

— Lattice paths with variable edge lengths are attached to some of the most 
classical objects of combinatorial theory (Note 18). 


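EXAMPLE VII.11. Labelled hierarchies. The class $\mathcal{L}$ of labelled hierarchies, as defined in
Note II.18, p. 119, satisfies

$$\mathcal{L} = \mathcal{Z} + \mathrm{SET}_{\geq 2}(\mathcal{L}) \quad\Longrightarrow\quad L = z + e^{L} - 1 - L.$$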


These occur in statistical classification theory: given a collection of n distinguished items, 
Ln is the number of ways of superimposing a nontrivial classification (cf Figure 10). Such 
abstract classifications usually have no planar structure, hence our modelling by a labelled set 
construction. 

In the notations of Definition VII.4, p. 446, the basic function is $G(z,w) = z + e^w - 1 - w$,
which is analytic in $|z| < \infty$, $|w| < \infty$. The characteristic system is

$$r + e^s - 1 - s = s, \qquad e^s - 1 = 1,$$

which has a unique positive solution, $s = \log 2$, $r = 2\log 2 - 1$, obtained by solving the second
equation for $s$, then propagating the solution to get $r$. Thus, hierarchies belong to the smooth
implicit-function schema, and, by Theorem VII.3, the EGF $L(z)$ has a square-root singularity.


One then finds mechanically

$$\frac{1}{n!} L_n \sim \frac{1}{2\sqrt{\pi n^3}}\,(2\log 2 - 1)^{-n+1/2}.$$

(The unlabelled counterpart is the object of Note 22 below.) ...... END OF EXAMPLE VII.11.
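
The estimate for labelled hierarchies can be compared with exact counts. The following is a minimal sketch, not taken from the text: it assumes the functional equation $L = z + e^L - 1 - L$ and the leading term $\frac{1}{2\sqrt{\pi n^3}}(2\log 2 - 1)^{-n+1/2}$ for $L_n/n!$. The function names `hierarchy_egf_coeffs` and `hierarchy_asymptotic` are hypothetical.

```python
from fractions import Fraction
from math import factorial, log, pi, sqrt


def hierarchy_egf_coeffs(n_max):
    """Return [l_0, ..., l_{n_max}] with l_n = L_n / n!, as exact rationals.

    With E(z) = exp(L(z)), the equation L = z + E - 1 - L gives
    2 l_n = [n == 1] + e_n, and E' = L' E gives
    n e_n = sum_{k=1..n} k l_k e_{n-k}.
    """
    l = [Fraction(0)] * (n_max + 1)
    e = [Fraction(0)] * (n_max + 1)
    e[0] = Fraction(1)
    for n in range(1, n_max + 1):
        # Part of e_n that only involves coefficients already computed.
        known = sum((k * l[k] * e[n - k] for k in range(1, n)), Fraction(0)) / n
        l[n] = known + (1 if n == 1 else 0)
        e[n] = l[n] + known
    return l


def hierarchy_asymptotic(n):
    """Leading-order estimate of L_n / n! (assumed form, see lead-in)."""
    r = 2 * log(2) - 1
    return r ** (-n + 0.5) / (2 * sqrt(pi * n ** 3))


coeffs = hierarchy_egf_coeffs(60)
print([int(coeffs[n] * factorial(n)) for n in range(1, 6)])
print(float(coeffs[60]) / hierarchy_asymptotic(60))
```

The first line printed lists $L_1,\ldots,L_5$; the second is the ratio of the exact value to the estimate at $n = 60$, which should be close to 1 with a deviation of order $1/n$.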


▷ VII.17. The degree profile of hierarchies. Combining BGF techniques and singularity analysis,
it is found that a random hierarchy of some large size $n$ has on average about $0.57n$ nodes
of degree 2, $0.18n$ nodes of degree 3, $0.04n$ nodes of degree 4, and less than $0.01n$ nodes of
degree 5 or higher. ◁

A fragment of RNA is, in first approximation, a tree-like structure with edges corresponding
to base pairs and “loops” corresponding to leaves. There are constraints on the sizes of leaves
(taken here between 4 and 7) and length of edges (here between 1 and 4 base pairs). We
model such an RNA fragment as a planted tree $P$ attached to a binary tree ($Y$) with equations:

$$P = AY, \qquad Y = AY^2 + B, \qquad A = z^2 + z^4 + z^6 + z^8, \qquad B = z^4 + z^5 + z^6 + z^7.$$

FIGURE VII.11. A simplified combinatorial model of RNA structures as considered by
Waterman et al. [271, 430, 453].


EXAMPLE VII.12. Trees enumerated by leaves. For a (nonempty) set $\Omega \subset \mathbb{Z}_{\geq 0}$ that does not
contain 0, 1, it makes sense to consider the class of labelled trees,

$$\mathcal{C} = \mathcal{Z} + \mathfrak{K}_\Omega(\mathcal{C}), \qquad \mathfrak{K} = \mathrm{SEQ} \ \text{or}\ \mathrm{SET}.$$

(A totally similar discussion can be conducted for unlabelled plane trees, with OGFs replacing
EGFs.) These are rooted trees (plane or non-plane, respectively), with size determined by the
number of leaves and with degrees constrained to lie in $\Omega$. The EGF is then of the form

$$C(z) = z + \eta(C(z)).$$

This variety of trees includes the labelled hierarchies, which correspond to $\eta(w) = e^w - 1 - w$.

Assume for simplicity $\eta$ here to be entire (possibly a polynomial). The base function is
$G(z,w) = z + \eta(w)$, and the characteristic system is $s = r + \eta(s)$, $\eta'(s) = 1$. Since $\eta'(0) = 0$
and $\eta'(+\infty) = +\infty$, this system always has a solution:

$$s = \eta'^{(-1)}(1), \qquad r = s - \eta(s).$$

Thus Theorem VII.3 applies, giving

$$[z^n]C(z) \sim \frac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n}, \qquad \gamma = \sqrt{\frac{2r}{\eta''(s)}}, \tag{41}$$

(since $G_z = 1$ and $G_{ww} = \eta''(s)$), and a full expansion can be obtained. ...... END OF EXAMPLE VII.12.
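
The recipe of Example VII.12 (solve $\eta'(s) = 1$, set $r = s - \eta(s)$, then read off the leading term) lends itself to a quick numerical check. The sketch below is illustrative only: it takes $\eta$ to be a polynomial given by its coefficient list, uses the ordinary-generating-function (plane tree) variant mentioned in the example, and assumes $\gamma = \sqrt{2r/\eta''(s)}$, which follows from $G_z = 1$ and $G_{ww} = \eta''(s)$. All function names are hypothetical.

```python
from math import pi, sqrt


def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def poly_deriv(coeffs):
    """Coefficient list of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def characteristic_solution(eta):
    """Solve eta'(s) = 1 by bisection, then r = s - eta(s) and gamma.

    eta = [0, 0, eta_2, eta_3, ...] with nonnegative entries, so that eta'
    increases from 0 along the positive axis.
    """
    d1 = poly_deriv(eta)
    d2 = poly_deriv(d1)
    lo, hi = 0.0, 1.0
    while poly_eval(d1, hi) < 1.0:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if poly_eval(d1, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    s = (lo + hi) / 2
    r = s - poly_eval(eta, s)
    gamma = sqrt(2 * r / poly_eval(d2, s))
    return r, s, gamma


def exact_coeffs(eta, n_max):
    """Coefficients of C(z) = z + eta(C(z)) up to z^n_max (fixed-point iteration)."""
    c = [0] * (n_max + 1)
    for _ in range(n_max):                 # each pass fixes one more coefficient
        power = [1] + [0] * n_max          # C^0
        new = [0] * (n_max + 1)
        new[1] = 1                         # the term z
        for k, ek in enumerate(eta):
            if k > 0:                      # power <- C^k, truncated
                power = [sum(power[i] * c[m - i] for i in range(m + 1))
                         for m in range(n_max + 1)]
            if ek:
                new = [a + ek * b for a, b in zip(new, power)]
        c = new
    return c


eta = [0, 0, 1]                            # Omega = {2}: binary plane trees by leaves
r, s, gamma = characteristic_solution(eta)
c = exact_coeffs(eta, 40)
estimate = gamma / (2 * sqrt(pi * 40 ** 3)) * r ** (-40)
print(r, s, gamma, c[1:7], c[40] / estimate)
```

For $\Omega = \{2\}$ the equation is $C = z + C^2$, so the exact coefficients are Catalan numbers and the characteristic values are $s = 1/2$, $r = 1/4$; other finite $\Omega$ can be tried by changing `eta`.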


EXAMPLE VII.13. Trees with variable edge lengths and node sizes. Consider unlabelled plane
trees in which nodes can be of different sizes: what is given is a set $\Omega$ of ordered pairs $(\omega,\sigma)$,
where a value $(\omega,\sigma)$ means that a node of degree $\omega$ and size $\sigma$ is allowed. Simple varieties
in their basic form correspond to $\sigma \equiv 1$; trees enumerated by leaves (including hierarchies)
correspond to $\sigma \in \{0,1\}$ with $\sigma = 1$ iff $\omega = 0$. Figure 11 indicates the way such trees can
model the self-bonding of single stranded nucleic acids like RNA, according to Waterman et
al. [271, 430, 453]. Clearly an extremely large number of variations are possible.
The fundamental equation in the case of a finite $\Omega$ is

$$Y(z) = P(z, Y(z)), \qquad P(z,w) := \sum_{(\omega,\sigma)\in\Omega} z^{\sigma} w^{\omega},$$

with $P$ a polynomial. In the aperiodic case, there is invariably a formula of the form

$$Y_n \sim \kappa \cdot A^n n^{-3/2},$$

corresponding to the universal square-root singularity. ...... END OF EXAMPLE VII.13.


▷ VII.18. Schröder numbers. Consider the class $\mathcal{Y}$ of unary-binary trees where unary nodes
have size 2, while leaves and binary nodes have the usual size 1. The GF satisfies
$Y = z + z^2 Y + zY^2$, so that

$$Y(z) = zD(z^2), \qquad D(z) = \frac{1 - z - \sqrt{1 - 6z + z^2}}{2z}.$$

We have $D(z) = 1 + 2z + 6z^2 + 22z^3 + 90z^4 + 394z^5 + \cdots$, which is EIS A006318
(“Large Schröder numbers”). By the bijective correspondence between trees and lattice paths,
$Y_{2n+1}$ is in correspondence with excursions of length $2n$ made of steps $(1,1)$, $(2,0)$, $(1,-1)$.
Upon tilting by 45°, this is equivalent to paths connecting the lower left corner to the upper
right corner of an $(n \times n)$ square that are made of horizontal, vertical, and diagonal steps, and
never go under the main diagonal. The series $S = \frac{z}{2}(1 + D)$ enumerates Schröder's generalized
parenthesis systems (Chapter I, p. 64): $S := z + S^2/(1 - S)$, and the asymptotic formula

$$Y_{2n-1} = D_{n-1} = 2S_n \sim \frac{1}{2^{3/4}\sqrt{\pi n^3}}\,\bigl(3 - 2\sqrt{2}\bigr)^{-n+1/2} \qquad (n \geq 2)$$

follows straightforwardly. ◁
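
The identification with the large Schröder numbers can be confirmed directly from the functional equation. The sketch below is an illustration under stated assumptions: it uses $Y = z + z^2Y + zY^2$ and, for the estimate, the leading term obtained by singularity analysis of $D(z)$ at $\rho = 3 - 2\sqrt{2}$, namely $Y_{2n-1} \sim (3 - 2\sqrt2)^{-n+1/2} / (2^{3/4}\sqrt{\pi n^3})$. The function names are hypothetical.

```python
from math import pi, sqrt


def unary_binary_counts(n_max):
    """Y_n for Y = z + z^2 Y + z Y^2 (unary nodes have size 2), n = 0..n_max."""
    y = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        total = 1 if n == 1 else 0                      # a single leaf
        if n >= 3:
            total += y[n - 2]                           # unary root of size 2
        # binary root of size 1 with two subtrees of total size n - 1
        total += sum(y[i] * y[n - 1 - i] for i in range(1, n - 1))
        y[n] = total
    return y


def schroeder_estimate(n):
    """Leading term for Y_{2n-1} = D_{n-1} (assumed form, see lead-in)."""
    rho = 3 - 2 * sqrt(2)
    return rho ** (-n + 0.5) / (2 ** 0.75 * sqrt(pi * n ** 3))


y = unary_binary_counts(199)
print([y[2 * m + 1] for m in range(6)])      # D_0, ..., D_5
print(y[199] / schroeder_estimate(100))      # ratio exact / estimate at n = 100
```

Only odd sizes occur, in agreement with $Y(z) = zD(z^2)$.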


VII. 5. Nonplane unlabelled trees and Pólya operators


Essentially all the results obtained earlier for simple varieties of trees extend to
the case of nonplane unlabelled trees. Pólya operators are central, and their treatment
is typical of the asymptotic theory of unlabelled objects obeying symmetries (i.e.,
involving the unlabelled MSET, PSET, CYC constructions), as seen repeatedly in this
book.

Binary and general trees. We shall start the discussion by considering the enumeration
of two classes of non-plane trees following Pólya [395, 397] and Otter [382],
whose articles are important historic sources for the asymptotic theory of nonplane
tree enumeration; a brief account also appears in [259]. (These authors used the
more traditional method of Darboux instead of singularity analysis, but this distinction
is immaterial here as calculations develop along completely parallel lines under
both theories.) The two classes under consideration are those of general and binary
non-plane unlabelled trees. In both cases, there is a fairly direct reduction to the enumeration
of Cayley trees and of binary trees, which renders explicit several steps of
the calculation. The trick is, as usual, to treat quantities $f(z^2)$, $f(z^3)$, …, as “known”
analytic quantities.


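Proposition VII.4 (Special non-plane unlabelled trees). Consider the two classes of
non-plane unlabelled trees

$$\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}), \qquad \mathcal{W} = \mathcal{Z} \times \mathrm{MSET}_{\{0,2\}}(\mathcal{W}),$$

respectively of the general and binary type. Then, with constants $\gamma_H$, $A_H$ and $\gamma_W$, $A_W$
given by Notes 20 and 21, one has

$$H_n \sim \frac{\gamma_H}{2\sqrt{\pi n^3}}\, A_H^{\,n}, \qquad W_{2n-1} \sim \frac{\gamma_W}{2\sqrt{\pi n^3}}\, A_W^{\,n}.$$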
PROOF. (i) General case. The OGF of nonplane unlabelled trees is the analytic
solution to the functional equation

$$H(z) = z \exp\Bigl(\frac{H(z)}{1} + \frac{H(z^2)}{2} + \frac{H(z^3)}{3} + \cdots\Bigr). \tag{42}$$

Let $T$ be the solution to

$$T(z) = z e^{T(z)}, \tag{43}$$

that is to say, the Cayley function. The function $H(z)$ has a radius of convergence $\rho$
strictly less than 1 as its coefficients dominate those of $T(z)$, the radius of convergence
of the latter being exactly $e^{-1} \doteq 0.367$. The radius $\rho$ cannot be 0 since the number of
trees is bounded from above by the number of plane trees whose OGF has radius $\frac{1}{4}$.
Thus, one has $\frac{1}{4} \leq \rho \leq e^{-1}$.
Rewriting the defining equation of $H(z)$ as

$$H(z) = \zeta e^{H(z)} \quad\text{with}\quad \zeta := z \exp\Bigl(\frac{H(z^2)}{2} + \frac{H(z^3)}{3} + \cdots\Bigr),$$

we observe that $\zeta = \zeta(z)$ is analytic for $|z| < \rho^{1/2}$, that is to say in a disk that properly
contains the disk of convergence of $H(z)$. We may thus rewrite $H(z)$ as

$$H(z) = T(\zeta(z)).$$

Since $\zeta(z)$ is analytic at $z = \rho$, a singular expansion of $H(z)$ near $z = \rho$ results from
composing the singular expansion of $T$ at $e^{-1}$ with the analytic expansion of $\zeta$ at $\rho$.
In this way, we get:

$$H(z) = 1 - \gamma\Bigl(1 - \frac{z}{\rho}\Bigr)^{1/2} + O\Bigl(1 - \frac{z}{\rho}\Bigr), \qquad \gamma = \sqrt{2e\rho\,\zeta'(\rho)}. \tag{44}$$

Thus,

$$[z^n]H(z) \sim \frac{\gamma}{2\sqrt{\pi n^3}}\,\rho^{-n}.$$


(ii) Binary case. Consider the functional equation

$$f(z) = z + \frac{1}{2}f(z)^2 + \frac{1}{2}f(z^2). \tag{45}$$

This enumerates non-plane binary trees with size defined as the number of external
nodes, so that $W(z) = \frac{1}{z} f(z^2)$. Thus, it suffices to analyse $[z^n] f(z)$, which dispenses
us from dealing with periodicity phenomena.

The OGF $f(z)$ has a radius of convergence $\rho$ that is at least $\frac{1}{4}$ (since there are
fewer non-plane trees than plane ones). It is also at most $\frac{1}{2}$ as results from comparison
of $f$ with the solution to the equation $g = z + \frac{1}{2}g^2$. We may then proceed as before:
treat the term $\frac{1}{2}f(z^2)$ as a function analytic in $|z| < \rho^{1/2}$, as though it were known,
then solve. To this effect, set

$$\zeta(z) := z + \frac{1}{2}f(z^2),$$

which exists in $|z| < \rho^{1/2}$. Then, the equation (45) becomes a plain quadratic equation,
$f = \zeta + \frac{1}{2}f^2$, with solution

$$f(z) = 1 - \sqrt{1 - 2\zeta(z)}.$$

The singularity $\rho$ is the smallest positive solution of $\zeta(\rho) = \frac{1}{2}$. The singular expansion
of $f$ is obtained by composing the analytic expansion of $\zeta$ at $\rho$ with $\sqrt{1 - 2\zeta}$. The
usual square-root singularity results:

$$f(z) \sim 1 - \gamma\sqrt{1 - z/\rho}, \qquad \gamma := \sqrt{2\rho\,\zeta'(\rho)}.$$

This induces the $\rho^{-n}n^{-3/2}$ form for the coefficients $[z^n] f(z) \equiv [z^{2n-1}]W(z)$.
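
The binary case can be followed numerically. The sketch below is a hedged illustration, not the authors' procedure: it computes $f_n$ from equation (45), truncates $\zeta(z) = z + \frac{1}{2}f(z^2)$ after $m$ terms (the truncation order `m` is an assumption), finds $\rho$ by bisection on $\zeta(\rho) = \frac{1}{2}$ inside the bracket $[\frac14, \frac12]$ given in the proof, and evaluates $\gamma = \sqrt{2\rho\,\zeta'(\rho)}$. The function names are hypothetical.

```python
from math import sqrt


def binary_nonplane_counts(n_max):
    """f_n = [z^n] f(z) for f(z) = z + f(z)^2 / 2 + f(z^2) / 2 (n leaves)."""
    f = [0] * (n_max + 1)
    f[1] = 1
    for n in range(2, n_max + 1):
        conv = sum(f[i] * f[n - i] for i in range(1, n))   # [z^n] f(z)^2
        extra = f[n // 2] if n % 2 == 0 else 0             # [z^n] f(z^2)
        f[n] = (conv + extra) // 2                         # conv + extra is even
    return f


def binary_constants(m=60):
    """Approximate (rho, gamma) from zeta truncated after m terms."""
    f = binary_nonplane_counts(m)

    def zeta(x):
        return x + 0.5 * sum(f[n] * x ** (2 * n) for n in range(1, m + 1))

    def zeta_prime(x):
        return 1.0 + sum(n * f[n] * x ** (2 * n - 1) for n in range(1, m + 1))

    lo, hi = 0.25, 0.5                     # bounds on rho from the proof
    for _ in range(100):
        mid = (lo + hi) / 2
        if zeta(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2
    return rho, sqrt(2 * rho * zeta_prime(rho))


f = binary_nonplane_counts(10)
rho, gamma = binary_constants()
print(f[1:], rho, gamma)
```

Since the neglected terms of $\zeta$ are of order $\rho^{m}$, double precision is already saturated for moderate `m`; higher accuracy would need multiprecision arithmetic.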


▷ VII.19. Full asymptotic expansions for $H_n$, $W_{2n-1}$. They can be determined since the OGFs
admit complete asymptotic expansions in powers of $\sqrt{1 - z/\rho}$. ◁


The argument used in the proof of the proposition may seem partly nonconstructive.
However, numerically, the values of $\rho$ and $C$ can be determined to great accuracy.
See the notes below as well as Finch's section on “Otter's tree enumeration
constants” [165, Sec. 5.6].
▷ VII.20. Numerical evaluation of constants I. Here is an unoptimized procedure controlled
by a parameter $m \geq 0$ for general non-plane unlabelled trees.

Procedure Get_value_of_ρ(m : integer);
1. Set up a procedure to compute and memorize the $H_n$ on demand;
   (this can be based on recurrence relations implied by $H'(z)$; see [373])
2. Define $f^{[m]}(z) := \sum_{n=1}^{m} H_n z^n$;
3. Define $\zeta^{[m]}(z) := z \exp\bigl(\sum_{k=2}^{m} \frac{1}{k} f^{[m]}(z^k)\bigr)$;
4. Solve numerically $\zeta^{[m]}(x) = e^{-1}$ for $x \in (0,1)$ to $\max(m, 10)$ digits of accuracy;
5. Return $x$ as an approximation to $\rho$.

For instance, a conservative estimate of the accuracy attained for $m = 0, 10, \ldots, 50$ (in a few
billion machine instructions) is:

Accuracy appears to be a little better than $10^{-m/2}$. This yields to 25D:

$$\rho \doteq 0.3383218568992076951961126, \qquad A_H \equiv \rho^{-1} \doteq 2.955765285651994974714818,$$
$$\gamma_H \doteq 1.559490020374640885542206.$$

The formula of the Proposition estimates $H_{100}$ with a relative error of $10^{-3}$. ◁
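
The procedure of this note can be sketched in double precision as follows. This is an illustrative transcription under assumptions, not the authors' code: the $H_n$ are computed by the classical recurrence $nH_{n+1} = \sum_{k=1}^{n}\bigl(\sum_{d\mid k} dH_d\bigr)H_{n-k+1}$, the equation $\zeta^{[m]}(x) = e^{-1}$ is solved by bisection on $[\frac14, e^{-1}]$, and $\gamma_H$ is evaluated as $\sqrt{2e\rho\,\zeta'(\rho)}$ using the logarithmic derivative of $\zeta$. The function names are hypothetical.

```python
from math import e, exp, sqrt


def rooted_tree_counts(n_max):
    """H_n for n = 0..n_max (H_1 = 1), nonplane unlabelled rooted trees."""
    h = [0] * (n_max + 1)
    h[1] = 1
    div_sum = [0] * (n_max + 1)            # div_sum[k] = sum over d | k of d * H_d
    for n in range(1, n_max):
        for multiple in range(n, n_max + 1, n):
            div_sum[multiple] += n * h[n]
        h[n + 1] = sum(div_sum[k] * h[n - k + 1] for k in range(1, n + 1)) // n
    return h


def otter_constants(m=60):
    """Approximate (rho, gamma_H) from series truncated after m terms."""
    h = rooted_tree_counts(m)

    def f(x):                              # truncated H(x)
        return sum(h[n] * x ** n for n in range(1, m + 1))

    def f_prime(x):
        return sum(n * h[n] * x ** (n - 1) for n in range(1, m + 1))

    def zeta(x):
        return x * exp(sum(f(x ** k) / k for k in range(2, m + 1)))

    lo, hi = 0.25, 1 / e                   # bounds on rho from the proof
    for _ in range(100):
        mid = (lo + hi) / 2
        if zeta(mid) < 1 / e:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2
    # zeta'(x) / zeta(x) = 1/x + sum_{k >= 2} x^(k-1) f'(x^k)
    log_deriv = 1 / rho + sum(rho ** (k - 1) * f_prime(rho ** k)
                              for k in range(2, m + 1))
    gamma = sqrt(2 * e * rho * zeta(rho) * log_deriv)
    return rho, gamma


h = rooted_tree_counts(10)
rho, gamma = otter_constants()
print(h[1:], rho, gamma)
```

With floating-point arithmetic only about 15 digits are meaningful; reproducing the 25 digits quoted in the note would require multiprecision arithmetic.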


▷ VII.21. Numerical evaluation of constants II. The procedure of the previous note adapts
easily to give:

$$\rho \doteq 0.4026975036714412909690453, \qquad A_W \equiv \rho^{-1} \doteq 2.483253536172636858562289,$$
$$\gamma_W \doteq 1.130033716398972007144137.$$

The formula of the Proposition estimates $[z^{100}] f(z)$ with a relative error of $7 \cdot 10^{-3}$. ◁


The two results, general and binary, are thus obtained by a modification of the 
method used for simple varieties of trees, upon treating the Polya operator part as an 
analytic variant of the corresponding equations of simple varieties of trees. 




Alkanes, alcohols, and degree restrictions. The previous two examples suggest
that a general theory is possible, for varieties of unlabelled non-plane trees,
$\mathcal{T} = \mathcal{Z}\,\mathrm{MSET}_\Omega(\mathcal{T})$, for some $\Omega \subset \mathbb{Z}_{\geq 0}$. First, we examine the case of special regular
trees defined by $\Omega = \{0,3\}$, which, when viewed as alkanes and alcohols, are of relevance
to combinatorial chemistry (Example 14). Indeed, the problem of enumerating
isomers of such chemical compounds has been at the origin of Pólya's foundational
works [395, 397]. Then, we extend the method to the general situation of trees with
degrees constrained to some finite set $\Omega$ (Theorem VII.4).


EXAMPLE VII.14. Nonplane trees and alkanes. In chemistry, carbon atoms (C) are known to
have valency 4 while hydrogen (H) has valency 1. Alkanes, also known as paraffins (Figure 12),
are acyclic molecules formed of carbon and hydrogen atoms according to this rule and
without multiple bonds; they are thus of the type $\mathrm{C}_n\mathrm{H}_{2n+2}$. In combinatorial terms, we are
talking of unrooted trees with (total) node degrees in $\{1,4\}$. The rooted versions of these trees
are determined by the fact that a root is chosen and (out)degrees of nodes lie in the set $\Omega =
\{0,3\}$; these are rooted ternary trees and they correspond to alcohols (with the OH group
marking one of the carbon atoms).

Alcohols ($\mathcal{A}$) are the simplest to enumerate as they are rooted trees. The OGF starts as
(EIS A000598)

$$A(z) = 1 + z + z^2 + 2z^3 + 4z^4 + 8z^5 + 17z^6 + 39z^7 + 89z^8 + 211z^9 + \cdots,$$

with size being taken here as the number of internal nodes. The specification is

$$\mathcal{A} = \{\epsilon\} + \mathcal{Z}\,\mathrm{MSET}_3(\mathcal{A}).$$

(Equivalently $\mathcal{A}^+ := \mathcal{A} \setminus \{\epsilon\}$ satisfies $\mathcal{A}^+ = \mathcal{Z}\,\mathrm{MSET}_{0,1,2,3}(\mathcal{A}^+)$.) This implies that $A(z)$
satisfies the functional equation:

$$A(z) = 1 + z\Bigl(\frac{1}{3}A(z^3) + \frac{1}{2}A(z)A(z^2) + \frac{1}{6}A(z)^3\Bigr).$$

In order to apply Theorem VII.3, introduce the function

$$G(z,w) = 1 + z\Bigl(\frac{1}{3}A(z^3) + \frac{1}{2}A(z^2)\,w + \frac{1}{6}w^3\Bigr),$$

which exists in $|z| < |\rho|^{1/2}$ and $|w| < \infty$, with $\rho$ the (yet unknown) radius of convergence
of $A$. Like before, the Pólya terms $A(z^2)$, $A(z^3)$ are treated as known functions. By methods
similar to those used in the analysis of binary and general trees (Subsection VII. 5), we find that
the characteristic system admits a solution,

$$r \doteq 0.3551817423143773928, \qquad s \doteq 2.1174207009536310225,$$

so that $\rho = r$ and $y(\rho) = s$. Thus the growth of the number of alcohols is of the form
$\kappa \cdot 2.81546^n\, n^{-3/2}$.
Let $B(z)$ be the OGF of alkanes (EIS A000602), which are unrooted trees:

$$B(z) = 1 + z + z^2 + z^3 + 2z^4 + 3z^5 + 5z^6 + 9z^7 + 18z^8 + 35z^9 + 75z^{10} + \cdots.$$

For instance, $B_6 = 5$ because there are 5 isomers of hexane, $\mathrm{C}_6\mathrm{H}_{14}$, for which chemists had to
develop a nomenclature system, interestingly enough based on a diameter of the tree:
    Hexane:              CH3-CH2-CH2-CH2-CH2-CH3
    3-Methylpentane:     CH3-CH2-CH(CH3)-CH2-CH3
    2-Methylpentane:     CH3-CH(CH3)-CH2-CH2-CH3
    2,3-Dimethylbutane:  CH3-CH(CH3)-CH(CH3)-CH3
    2,2-Dimethylbutane:  CH3-C(CH3)2-CH2-CH3

FIGURE VII.12. A few examples of alkanes (CH4, C2H6, C3H8) and an alcohol.


The number of structurally different alkanes can then be found by an adaptation of the
dissimilarity formula (Equation (48) and Note 25). This problem has served as a powerful
motivation for the enumeration of graphical trees and its fascinating history goes back to Cayley.
(See Rains and Sloane's article [406] and [397].) The asymptotic formula of (unrooted) alkanes
involves an $A^n n^{-5/2}$ term, which represents roughly a proportion $1/n$ of the number of
(rooted) alcohols. ...... END OF EXAMPLE VII.14.
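
The initial coefficients and the growth rate quoted for alcohols can be reproduced from the functional equation alone. The sketch below is illustrative: it assumes $A(z) = 1 + z\bigl(\frac13 A(z^3) + \frac12 A(z)A(z^2) + \frac16 A(z)^3\bigr)$ and estimates the exponential growth rate from consecutive coefficients after compensating for the $n^{-3/2}$ factor. The function names are hypothetical.

```python
def alcohol_counts(n_max):
    """A_n for n = 0..n_max from the functional equation of the example."""
    a = [0] * (n_max + 1)
    a[0] = 1
    for n in range(n_max):
        # Coefficient of z^n in each of the three terms; only a[0..n] is needed.
        cube_sub = a[n // 3] if n % 3 == 0 else 0                     # A(z^3)
        mixed = sum(a[n - 2 * j] * a[j] for j in range(n // 2 + 1))   # A(z) A(z^2)
        square = [sum(a[i] * a[m - i] for i in range(m + 1)) for m in range(n + 1)]
        cube = sum(square[m] * a[n - m] for m in range(n + 1))        # A(z)^3
        a[n + 1] = (2 * cube_sub + 3 * mixed + cube) // 6
    return a


def growth_estimate(a, n):
    """Estimate of 1/rho from a[n+1] / a[n], corrected for the n^(-3/2) factor."""
    return a[n + 1] / a[n] * ((n + 1) / n) ** 1.5


a = alcohol_counts(31)
print(a[:10], growth_estimate(a, 30))
```

The estimate at $n = 30$ should agree with $1/r \doteq 2.81546$ to about two or three decimal places.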


The pattern of analysis should by now be clear, and we state: 


Theorem VII.4 (Nonplane unlabelled trees). Let $\Omega \ni 0$ be a finite subset of $\mathbb{Z}_{\geq 0}$
and consider the variety $\mathcal{V}$ of (rooted) nonplane unlabelled trees with outdegrees of
nodes in $\Omega$. Assume aperiodicity ($\gcd(\Omega) = 1$) and the condition that $\Omega$ contains at
least one element larger than 1. Then the number of trees of size $n$ in $\mathcal{V}$ satisfies an
asymptotic formula:

$$V_n \sim C \cdot A^n n^{-3/2}.$$


PROOF. The argument given for alcohols is transposed verbatim. Only the existence
of a root of the characteristic system needs to be established.

The radius of convergence of $V(z)$ is a priori $\leq 1$. The fact that it is strictly less
than 1 is established by means of an exponential lower bound, $V_n > B^n$, for some
$B > 1$ and infinitely many values of $n$. To obtain this exponential variability, first
choose an $n_0$ such that $V_{n_0} > 1$, then build a perfect $d$-ary tree (for some $d \in \Omega$,
$d \neq 0, 1$) of height $h$, and finally graft freely subtrees of size $n_0$ at $n/(4n_0)$
of the leaves of the perfect tree. Choosing $h$ such that $d^h > n/(4n_0)$ yields the
lower bound. That the radius of convergence is nonzero results from the upper bound
provided by corresponding plane trees whose growth is at most exponential. Thus, one
has $0 < \rho < 1$.
By the translation of multisets of bounded cardinality, the function $G$ is polynomial
in finitely many of the quantities $\{V(z), V(z^2), \ldots\}$. Thus the function $G(z,w)$
constructed like in the case of alcohols converges in $|z| < \rho^{1/2}$, $|w| < \infty$. As
$z \to \rho^{-}$, we must have $\tau := V(\rho)$ finite, since otherwise, there would be a contradiction
in orders of growth in the nonlinear equation $V(z) = \cdots + \cdots V(z)^d \cdots$
as $z \to \rho$. Thus $(\rho,\tau)$ satisfies $\tau = G(\rho,\tau)$. For the derivative, one must have
$G_w(\rho,\tau) = 1$ since: (i) a smaller value would mean that $V$ is analytic at $\rho$ (by the
Implicit Function Theorem); (ii) a larger value would mean that a singularity has
been encountered earlier (by the usual argument on failure of the Implicit Function
Theorem). Thus, Theorem VII.3 on positive implicit functions is applicable.

A large number of variations are clearly possible as evidenced by the title of an
article [258] published by Harary, Robinson, and Schwenk in 1975, namely, “Twenty-step
algorithm for determining the asymptotic number of trees of various species”.


▷ VII.22. Unlabelled hierarchies. The class $\mathcal{H}$ of unlabelled hierarchies is specified by
$\mathcal{H} = \mathcal{Z} + \mathrm{MSET}_{\geq 2}(\mathcal{H})$ (see Note 42, p. 68). One has

$$H_n \sim \frac{\gamma}{2\sqrt{\pi n^3}}\,\rho^{-n}, \qquad \rho \doteq 0.29224.$$

(Compare with the labelled case of Example 11.) What is the asymptotic proportion of internal
nodes of degree $r$, for a fixed $r > 0$? ◁


▷ VII.23. Trees with prime degrees and the BBY theory. Bell, Burris, and Yeats [25] develop
a general theory meant to account for the fact that, in their words, “almost any family of trees
defined by a recursive equation that is nonlinear [...] lead[s] to an asymptotic law of the
Pólya form $t(n) \sim C\rho^{-n}n^{-3/2}$.” Their most general result (Th. 75) implies for instance that
the number of nonplane unlabelled trees whose node degrees are restricted to be prime numbers
admits such a Pólya form (see also Note 6, p. 436). ◁


Unlabelled functional graphs (mapping patterns). Unlabelled functional graphs,
also known as “mapping patterns” ($\mathcal{F}$) are unlabelled digraphs in which each vertex
has outdegree equal to 1. Equivalently, they can be regarded as multisets of components
($\mathcal{L}$) that are cycles of nonplane unlabelled trees ($\mathcal{H}$),

$$\mathcal{F} = \mathrm{MSET}(\mathcal{L}); \qquad \mathcal{L} = \mathrm{CYC}(\mathcal{H}); \qquad \mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}),$$

a specification that entirely parallels that of mappings in Equation (34), p. 443.
The OGF $H(z)$ has a square-root singularity by virtue of (44) above, with additionally
$H(\rho) = 1$. The translation of the unlabelled cycle construction,

$$L(z) = \sum_{j \geq 1} \frac{\varphi(j)}{j} \log\frac{1}{1 - H(z^j)},$$

implies that $L(z)$ is logarithmic, and $F(z)$ has a singularity of type $1/\sqrt{Z}$ where
$Z := 1 - z/\rho$. Thus, unlabelled functional graphs constitute an exp-log structure
with $\kappa = \frac{1}{2}$. The number of unlabelled functional graphs thus grows like $C\rho^{-n}n^{-1/2}$
and the mean number of components in a random functional graph is $\sim \frac{1}{2}\log n$, like
for the labelled mapping counterpart. See [357] for more on this topic.
▷ VII.24. An alternative form of F(z). Arithmetical simplifications associated with the Euler
totient function yield:

$$F(z) = \prod_{k=1}^{\infty} \bigl(1 - H(z^k)\bigr)^{-1}.$$

A similar form applies generally to multisets of unlabelled cycles. ◁
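
The product form of this note gives a direct way to tabulate mapping patterns. The following sketch is an illustration, not part of the original text: it assumes $F(z) = \prod_{k \geq 1}(1 - H(z^k))^{-1}$ with $H$ the OGF of nonplane unlabelled rooted trees, and multiplies the truncated factors one at a time. The function names are hypothetical.

```python
def rooted_tree_counts(n_max):
    """H_n for n = 0..n_max (H_1 = 1), nonplane unlabelled rooted trees."""
    h = [0] * (n_max + 1)
    h[1] = 1
    div_sum = [0] * (n_max + 1)            # div_sum[k] = sum over d | k of d * H_d
    for n in range(1, n_max):
        for multiple in range(n, n_max + 1, n):
            div_sum[multiple] += n * h[n]
        h[n + 1] = sum(div_sum[k] * h[n - k + 1] for k in range(1, n + 1)) // n
    return h


def mapping_pattern_counts(n_max):
    """F_n for n = 0..n_max from the product over k of 1 / (1 - H(z^k))."""
    h = rooted_tree_counts(n_max)
    f = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        # q = 1 / (1 - H(z^k)) satisfies q = 1 + H(z^k) q; only multiples of k occur.
        q = [0] * (n_max + 1)
        q[0] = 1
        for n in range(k, n_max + 1, k):
            q[n] = sum(h[j] * q[n - j * k] for j in range(1, n // k + 1))
        f = [sum(f[i] * q[n - i] for i in range(n + 1)) for n in range(n_max + 1)]
    return f


print(mapping_pattern_counts(6))
```

For instance the coefficient of $z^2$ is 3: two fixed points, one 2-cycle, or a fixed point with one node attached.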


Unrooted trees. All the trees considered so far have been rooted and this version
is the one most useful in applications. An unrooted tree⁸ is by definition a connected
acyclic (undirected) graph. In that case, the tree is clearly non-plane and no special
root node is distinguished.

The counting of the class $\mathcal{U}$ of unrooted labelled trees is easy: there are plainly
$U_n = n^{n-2}$ of these, since each node is distinguished by its label, which entails that
$nU_n = T_n$, with $T_n = n^{n-1}$ by Cayley's formula. Also, the EGF $U(z)$ satisfies

$$U(z) = \int_0^z T(y)\,\frac{dy}{y} = T(z) - \frac{1}{2}T(z)^2, \tag{46}$$

as already seen when we discussed labelled graphs in Chapter II.

For unrooted unlabelled trees, symmetries are in the way and a tree can be rooted
in a number of ways that depends on its shape. For instance, a star graph leads to a
number of different rooted trees that equals 2 (choose either the center or one of the
peripheral nodes), while a line graph gives rise to $\lceil n/2 \rceil$ structurally different rooted
trees. With $\mathcal{H}$ the class of rooted unlabelled trees and $\mathcal{I}$ the class of unrooted trees,
we have at this stage only the general inequality

$$I_n \leq H_n \leq nI_n.$$

A table of values of the ratio $H_n/I_n$ suggests that the answer is closer to the upper
bound:

(47)    n          10      20      30      40      50      60
        H_n/I_n    6.78    15.58   23.89   32.15   40.39   48.62


The solution is provided by a famous exact formula due to Otter (Note 25):

$$I(z) = H(z) - \frac{1}{2}\bigl(H(z)^2 - H(z^2)\bigr), \tag{48}$$

which gives in particular (EIS A000055) $I(z) = z + z^2 + z^3 + 2z^4 + 3z^5 + 6z^6 + 11z^7 + 23z^8 + \cdots$.
Given (48), it is child's play to determine the singular expansion
of $I$ knowing that of $H$. The radius of convergence of $I$ is the same as that of $H$ as
the term $H(z^2)$ only introduces exponentially small coefficients. Thus, it suffices to
analyse $H - \frac{1}{2}H^2$:

$$H(z) - \frac{1}{2}H(z)^2 \sim \frac{1}{2} + \delta_2 Z + \delta_3 Z^{3/2} + O(Z^2), \qquad Z = \Bigl(1 - \frac{z}{\rho}\Bigr).$$

What is noticeable is the cancellation in coefficients for the term $Z^{1/2}$ (since
$1 - x - \frac{1}{2}(1 - x)^2 = \frac{1}{2} + O(x^2)$), so that $Z^{3/2}$ is the actual singularity type of $I$. Clearly,
the constant $\delta_3$ is computable from the first four terms in the singular expansion of $H$
at $\rho$. Then singularity analysis yields: The number of unrooted trees of size $n$ satisfies
the formula

$$I_n \sim \frac{3\delta_3}{4\sqrt{\pi n^5}}\,\rho^{-n}, \qquad I_n \sim 0.5349496061 \cdot 2.9557652856^n\, n^{-5/2}. \tag{49}$$

The numerical values are from [165] and the result is Otter's original [382]: an unrooted
tree of size $n$ gives rise to about $0.8n$ different rooted trees on average. (The
formula (49) corresponds to an error slightly under $10^{-2}$ for $n = 100$.)

8 Unrooted trees are also called sometimes free trees.
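
Otter's formula makes the ratio table and the asymptotic estimate easy to re-examine. The sketch below is a hedged illustration: it assumes $I(z) = H(z) - \frac{1}{2}(H(z)^2 - H(z^2))$ and takes the constants $0.5349496061$ and $2.9557652856$ (the latter being $\rho^{-1}$ as given in Note 20) for the estimate of $I_n$. The function names are hypothetical.

```python
def rooted_tree_counts(n_max):
    """H_n for n = 0..n_max (H_1 = 1), nonplane unlabelled rooted trees."""
    h = [0] * (n_max + 1)
    h[1] = 1
    div_sum = [0] * (n_max + 1)            # div_sum[k] = sum over d | k of d * H_d
    for n in range(1, n_max):
        for multiple in range(n, n_max + 1, n):
            div_sum[multiple] += n * h[n]
        h[n + 1] = sum(div_sum[k] * h[n - k + 1] for k in range(1, n + 1)) // n
    return h


def unrooted_tree_counts(n_max):
    """Return (H, I) with I_n obtained from Otter's formula."""
    h = rooted_tree_counts(n_max)
    counts = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        square = sum(h[i] * h[n - i] for i in range(1, n))   # [z^n] H(z)^2
        diag = h[n // 2] if n % 2 == 0 else 0                # [z^n] H(z^2)
        counts[n] = h[n] - (square - diag) // 2
    return h, counts


def unrooted_estimate(n):
    """Leading-order estimate of I_n with the constants quoted in the text."""
    return 0.5349496061 * 2.9557652856 ** n * n ** -2.5


h, unrooted = unrooted_tree_counts(100)
print(unrooted[1:9])
print([round(h[n] / unrooted[n], 2) for n in (10, 20)])
print(unrooted[100] / unrooted_estimate(100))
```

The last ratio measures the relative error of the estimate at $n = 100$, which should be of the order of one percent.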


▷ VII.25. Dissimilarity theorem for trees. Here is how combinatorics justifies (48), following
[39, §4.1]. Let $\mathcal{I}^{\bullet}$ (and $\mathcal{I}^{\bullet-\bullet}$) be the class of unrooted trees with one vertex (respectively one
edge) distinguished. We have $\mathcal{I}^{\bullet} \cong \mathcal{H}$ (rooted trees) and $\mathcal{I}^{\bullet-\bullet} \cong \mathrm{SET}_2(\mathcal{H})$. The combinatorial
isomorphism claimed is

$$\mathcal{I}^{\bullet} + \mathcal{I}^{\bullet-\bullet} \cong \mathcal{I} + (\mathcal{H} \times \mathcal{H}). \tag{50}$$

A diameter of an unrooted tree is a simple path of maximal length. If the length of any diameter
is even, call “centre” its mid-point; otherwise, call “bicentre” its mid-edge. (For each tree, there
is either one centre or one bicentre.) The left-hand side of (50) corresponds to trees that are
pointed either at a vertex ($\mathcal{I}^{\bullet}$) or an edge ($\mathcal{I}^{\bullet-\bullet}$). The term $\mathcal{I}$ on the right-hand side corresponds
to cases where the pointing happens to coincide with the canonical centre or bicentre. If there
is no coincidence, then an ordered pair of trees results from a suitable surgery of the pointed
tree. [Hint: cut in some canonical way near the pointed vertex or edge.] ◁


VII. 6. Irreducible context-free structures 


In this section, we discuss an important variety of context-free classes, one that
satisfies the universal law of square-root singularities, attached to counting sequences
that are of the asymptotic form $A^n n^{-3/2}$. (General algebraic functions are treated in
the next section.)


VII. 6.1. Context-free specifications and the irreducibility schema. We start
from the notion of a context-free class introduced in Subsection I. 5.4, p. 75. A class
is context-free if it is the first component of a system of combinatorial equations

$$\begin{cases} \mathcal{Y}_1 = \mathfrak{F}_1(\mathcal{Z}, \mathcal{Y}_1, \ldots, \mathcal{Y}_r)\\ \quad\vdots\\ \mathcal{Y}_r = \mathfrak{F}_r(\mathcal{Z}, \mathcal{Y}_1, \ldots, \mathcal{Y}_r), \end{cases} \tag{51}$$

where each $\mathfrak{F}_j$ is a construction that only involves the combinatorial constructions of
disjoint union and cartesian product. (This repeats Equation (67) of Chapter I, p. 75.)
As seen in Subsection I. 5.4, binary and general trees, triangulations, as well as Dyck
and Łukasiewicz languages are typical instances of context-free classes.

As a consequence of the symbolic rules of Chapter I, the OGF of a context-free
class $\mathcal{C}$ is the first component ($C(z) \equiv y_1(z)$) of the solution of a polynomial system
of equations of the form

$$\begin{cases} y_1(z) = \Phi_1(z, y_1(z), \ldots, y_r(z))\\ \quad\vdots\\ y_r(z) = \Phi_r(z, y_1(z), \ldots, y_r(z)), \end{cases} \tag{52}$$

where the $\Phi_j$ are polynomials. By elimination (cf. APPENDIX B: Algebraic elimination,
p. 685), it is always possible to find a bivariate polynomial $P(z,y)$ such that

$$P(z, C(z)) = 0, \tag{53}$$

and $C(z)$ is an algebraic function. (Algebraic functions are discussed in all generality
in the next section.)

The case of linear systems has been dealt with in Chapter V, when examining
the transfer matrix method. Accordingly, we only need to consider here nonlinear
systems (of equations or specifications) defined by the condition that at least one $\Phi_j$
in (52) is a polynomial of degree 2 or more, corresponding to the fact that at least one
of the constructions $\mathfrak{F}_j$ in (51) involves at least a product $\mathcal{Y}_k\mathcal{Y}_\ell$.

Definition VII.5. A well-founded context-free specification (51) is said to belong to
the irreducible context-free schema if it is nonlinear and its dependency graph is
strongly connected. It is said to be aperiodic if all the $y_j(z)$ are aperiodic⁹.

Theorem VII.5 (Irreducible context-free schema). A class $\mathcal{C}$ that belongs to the irreducible
context-free schema has a generating function that has a square-root singularity
at its radius of convergence $\rho$:

$$C(z) = \tau - \gamma\sqrt{1 - \frac{z}{\rho}} + O\Bigl(1 - \frac{z}{\rho}\Bigr),$$

for computable algebraic numbers $\rho$, $\tau$, $\gamma$. If, in addition, $C(z)$ is aperiodic, then the
dominant singularity is unique and the counting sequence satisfies

$$C_n = \frac{\gamma}{2\sqrt{\pi n^3}}\,\rho^{-n}\bigl(1 + O(n^{-1})\bigr). \tag{54}$$

This theorem is none other than a transcription, at the combinatorial level, of a
remarkable analytic statement, Theorem VII.6, due to Drmota, Lalley, and Woods,
which is proved below (p. 466), is slightly stronger, and is of independent interest.

Computability issues. There are two possible approaches to the calculation of the 
quantities that appear in (54), one based on the original system (52), the other based 
on the single equation (53) that results from elimination. 

— System: From the proof of Theorem VII.6, it suffices to solve in positive real
numbers the polynomial system of m + 1 equations in the m + 1 unknowns
$\rho, \tau_1, \ldots, \tau_m$:

(55)    $\tau_1 = \Phi_1(\rho, \tau_1, \ldots, \tau_m), \ \ldots, \ \tau_m = \Phi_m(\rho, \tau_1, \ldots, \tau_m), \quad J(\rho, \tau_1, \ldots, \tau_m) = 0,$

where J is the Jacobian determinant ($\delta_{i,j} \equiv [[i = j]]$ represents Kronecker's
symbol):

(56)    $J(z, y_1, \ldots, y_m) := \det\left( \delta_{i,j} - \frac{\partial}{\partial y_j} \Phi_i(z, y_1, \ldots, y_m) \right).$

In that case, $\rho$ is the common radius of convergence of all the $y_j(z)$ and
$\tau_j = y_j(\rho)$. The constant $\gamma \equiv \gamma_1$ is a component solution of a linear system
of equations with coefficients in the field generated by $\rho, \tau_j$, which can be
obtained by the method of undetermined coefficients, knowing that $y_j$ is of
the form

(57)    $y_j(z) \sim \tau_j - \gamma_j \sqrt{1 - z/\rho}, \qquad z \to \rho.$

⁹ In the usual sense that the span of the coefficient sequence is equal to 1. For an irreducible system,
all the $y_j$ are aperiodic if and only if at least one of the $y_j$ is aperiodic.


— Equation: The general techniques are described in the next section, §VII. 7.
They give rise to the following algorithm: (i) determine the exceptional set
and isolate the dominant positive singularity; (ii) identify the coefficients in
the singular (Puiseux) expansion, knowing a priori that the singularity is of
the square-root type.


Whatever the adopted strategy, a symbolic algebra system proves invaluable in
performing the required algebraic eliminations.


▷ VII.26. Catalan and Jacobi. For the Catalan GF, defined by $y = 1 + zy^2$, the system of
Equation (55) instantiates to

        $\tau - 1 - \rho\tau^2 = 0, \qquad 1 - 2\rho\tau = 0,$

giving back as expected: $\rho = \frac{1}{4}$, $\tau = 2$. ◁
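
The characteristic system of the Catalan case can also be solved numerically, which is how one would typically proceed when no closed form is in sight. The following is a minimal sketch only, assuming a plain Newton iteration on the two equations $\tau = 1 + \rho\tau^2$ and $1 - 2\rho\tau = 0$; the function name `solve_characteristic` and the starting point (0.2, 1.5) are illustrative choices, not taken from the text.

```python
def solve_characteristic(rho, tau, steps=50):
    """Newton iteration for  tau = 1 + rho*tau^2  and  1 - 2*rho*tau = 0."""
    for _ in range(steps):
        f1 = tau - 1.0 - rho * tau**2      # tau - Phi(rho, tau)
        f2 = 1.0 - 2.0 * rho * tau         # Jacobian condition 1 - dPhi/dy = 0
        # Partial derivatives of (f1, f2) with respect to (rho, tau).
        a, b = -tau**2, 1.0 - 2.0 * rho * tau
        c, d = -2.0 * tau, -2.0 * rho
        det = a * d - b * c
        # Solve the 2x2 linear system for the Newton step by Cramer's rule.
        d_rho = (-f1 * d + f2 * b) / det
        d_tau = (-a * f2 + c * f1) / det
        rho, tau = rho + d_rho, tau + d_tau
    return rho, tau


rho, tau = solve_characteristic(0.2, 1.5)
print(rho, tau)
```

Newton's method only converges from a starting point close enough to the solution; for larger systems one would normally rely on the elimination facilities of a computer algebra system instead.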


VII. 6.2. Combinatorial applications. Random walks on free groups [323], di- 
rected walks in the plane [21, 320, 323] (see also p. 482 below), coloured trees [503], 
and boolean expression trees [85] are only some of the many combinatorial structures 
resorting to the irreducible context-free schema. Stanley presents in his book [449, 
Ch. 6] several examples of algebraic GFs, and an inspiring survey is provided by 
Bousquet-Mélou in [68]. We limit ourselves here to a brief discussion of non-crossing 
configurations. 


EXAMPLE VII.15. Non-crossing configurations. Context-free descriptions can model natu- 
rally very diverse sorts of objects including particular topological-geometric configurations— 
we examine here non-crossing planar configurations. The problems considered have their origin 
in combinatorial musings of the Rev. T.P. Kirkman in 1857 and were revisited in 1974 by Domb
and Barrett [132] for the purpose of investigating certain perturbative expansions of statistical
physics. Our presentation follows closely the synthesis offered by Flajolet and Noy in [196]. 

Consider, for each value of n, graphs built on vertices that are at the n complex roots 
of unity, numbered 0,...,n — 1. A non-crossing graph is a graph such that no two of its 
edges cross. One can also define connected non-crossing graphs, non-crossing forests (acyclic 
graphs), and non-crossing trees (acyclic connected graphs); see Figure 13. Note that the various
graphs considered can always be regarded as rooted in some canonical way (e.g., at the vertex
of smallest index).


Trees. A non-crossing tree is rooted at 0. To the root vertex is attached an ordered collection
of vertices, each of which has an end-node $\nu$ that is the common root of two non-crossing
trees, one on the left of the edge $(0, \nu)$, the other on the right of $(0, \nu)$. Let $\mathcal{T}$ denote the class
of trees and $\mathcal{U}$ denote the class of trees whose root has been severed. With o denoting a generic
node, we have

        $\mathcal{T} = o \times \mathcal{U}, \qquad \mathcal{U} = \mathrm{SEQ}(\mathcal{U} \times o \times \mathcal{U}),$

which corresponds graphically to the “butterfly decomposition” (diagram not reproduced here).


Configuration                        OGF
Trees (EIS: A001764)                 $T^3 - zT + z^2 = 0$
Forests (EIS: A054727)               $F^3 + (z^2 - z - 3)F^2 + (z + 3)F - 1 = 0$
Connected graphs (EIS: A007297)      $C^3 + C^2 - 3zC + 2z^2 = 0$
Graphs (EIS: A054726)                $G^2 + (2z^2 - 3z - 2)G + 3z + 1 = 0$

FIGURE VII.13. (Top) Non-crossing graphs: a tree, a forest, a connected graph, and
a general graph. (Bottom) The enumeration of non-crossing configurations by algebraic
functions. [The drawings and the entries of the column “Coefficients (exact / asymptotic)” are not legible in this copy.]


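The reduction to a pure context-free form is obtained by noticing that $\mathcal{U} = \mathrm{SEQ}(\mathcal{V})$ is
equivalent to $\mathcal{U} = 1 + \mathcal{U}\mathcal{V}$: a specification and the associated polynomial system are then

(58)    $\{\mathcal{T} = \mathcal{Z}\,\mathcal{U},\ \mathcal{U} = 1 + \mathcal{U}\mathcal{V},\ \mathcal{V} = \mathcal{Z}\,\mathcal{U}\,\mathcal{U}\}, \qquad \{T = zU,\ U = 1 + UV,\ V = zU^2\}.$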


This system relating U and V is irreducible (then, T is immediately obtained from U), and
aperiodicity is obvious from the first few values of the coefficients. The Jacobian of the
{U, V}-system, cf. (56) (obtained by $z \to \rho$, $U \to \upsilon$, $V \to \nu$), is

        $\begin{vmatrix} 1 - \nu & -\upsilon \\ -2\rho\upsilon & 1 \end{vmatrix} = 1 - \nu - 2\rho\upsilon^2.$

Thus, the system (55) giving the singularity of U, V is

        $\{\upsilon = 1 + \upsilon\nu,\ \ \nu = \rho\upsilon^2,\ \ 1 - \nu - 2\rho\upsilon^2 = 0\},$

whose positive solution is $\rho = \frac{4}{27}$, $\upsilon = \frac{3}{2}$, $\nu = \frac{1}{3}$. The complete asymptotic formula is
displayed in Figure 13. (In a simple case like this, we have more: T satisfies $T^3 - zT + z^2 = 0$,
which by Lagrange inversion gives $T_n = \frac{1}{2n-1}\binom{3n-3}{n-1}$.)

Forests. A (non-crossing) forest is a non-crossing graph that is acyclic. In the present con- 
text, it is not possible to express forests simply as sequences of trees, because of the geometry 
of the problem. 

Starting conventionally from the root vertex 0 and following all connected edges defines a 
“backbone” tree. To the left of every vertex of the tree, a forest may be placed. There results 
the decomposition (expressed directly in terms of OGFs), 


(59)    $F = 1 + T[z \mapsto zF],$


where T is the OGF of trees and F is the OGF of forests. In (59), the term $T[z \mapsto zF]$ denotes
a functional composition. A context-free specification in standard form results mechanically
from (58) upon replacing z by zF, namely


(60)    $F = 1 + T, \quad T = zFU, \quad U = 1 + UV, \quad V = zFU^2.$


This system is irreducible and aperiodic, so that the asymptotic shape of $F_n$ is of the form
$\gamma\omega^n n^{-3/2}$, as predicted by Theorem VII.5. (The values of constants are worked out in
Example 17 by means of the equational approach.)
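
The specification (60) can be turned into coefficients mechanically. The following sketch, given only as an illustration, iterates the four equations on power series truncated modulo $z^N$ and then checks that the resulting F is annihilated by the cubic listed for forests in Figure 13, namely $F^3 + (z^2 - z - 3)F^2 + (z + 3)F - 1$. The helper names `mul`, `add`, `shift` and the truncation order N = 7 are assumptions of this example.

```python
N = 7  # series are truncated modulo z^N


def mul(a, b):
    """Product of two power series (coefficient lists), truncated modulo z^N."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N - i]):
            c[i + j] += ai * bj
    return c


def add(*series):
    """Coefficientwise sum of several truncated series."""
    return [sum(coeffs) for coeffs in zip(*series)]


def shift(a):
    """Multiplication by z."""
    return [0] + a[: N - 1]


one = [1] + [0] * (N - 1)
F = T = U = V = [0] * N
# For this system, two passes fix at least one more coefficient of every component.
for _ in range(2 * N + 2):
    F, T, U, V = (
        add(one, T),                 # F = 1 + T
        shift(mul(F, U)),            # T = z F U
        add(one, mul(U, V)),         # U = 1 + U V
        shift(mul(F, mul(U, U))),    # V = z F U^2
    )

# Check the eliminated equation F^3 + (z^2 - z - 3) F^2 + (z + 3) F - 1 = 0.
F2 = mul(F, F)
F3 = mul(F2, F)
q2 = [-3, -1, 1] + [0] * (N - 3)     # z^2 - z - 3
q1 = [3, 1] + [0] * (N - 2)          # z + 3
cubic = add(F3, mul(q2, F2), mul(q1, F), [-c for c in one])

print(F)
print(cubic)
```

The first coefficients 1, 1, 2, 7, 33, 181 agree with the sequence A054727 cited in Figure 13, and the residual of the cubic vanishes up to the truncation order.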

Graphs. Similar constructions (see [196]) give the OGFs of connected and general graphs, 
as summarized in Figure 13. Note the common shape of the asymptotic estimates and also the 
fact that binomial expressions are available in each case. ..... END OF EXAMPLE VII.15. 


VII. 6.3. Analysis of irreducible polynomial systems. The analytic engine behind
Theorem VII.5 is a fundamental result, the “Drmota–Lalley–Woods” (DLW) Theorem,
due to independent research by several authors: Drmota [135] developed a version
of the theorem in the course of studies relative to limit laws in various families
of trees defined by context-free grammars; Woods [503], motivated by questions of
Boolean complexity and finite model theory, gave a form expressed in terms of colouring
rules for trees; finally, Lalley [323] came across a similarly general result when
quantifying return probabilities for random walks on groups. Drmota and Lalley show
how to pull out limit Gaussian laws for simple parameters (by a perturbative analysis;
see Chapter IX); Woods shows how to deduce estimates of coefficients even in some
periodic or non-irreducible cases.

In the treatment that follows we start from a polynomial system of equations,


        $\{ y_j = \Phi_j(z, y_1, \ldots, y_m) \}, \qquad j = 1, \ldots, m,$


in accordance with the notations adopted at the beginning of the section. We only consider
nonlinear systems defined by the fact that at least one polynomial $\Phi_j$ is nonlinear
in some of the indeterminates $y_1, \ldots, y_m$.
First, for combinatorial reasons, we define several possible attributes of a polynomial
system.
— Algebraic positivity (or a-positivity). A polynomial system is said to be a-positive
if all the component polynomials $\Phi_j$ have nonnegative coefficients.
Next, we want to restrict consideration to systems that determine a unique solution
vector $(y_1, \ldots, y_m) \in (\mathbb{C}[[z]])^m$. Define the z-valuation $\mathrm{val}(\vec y)$ of a vector $\vec y \in \mathbb{C}[[z]]^m$
as the minimum over all j's of the individual valuations¹⁰ $\mathrm{val}(y_j)$. The distance between
two vectors is defined as usual by $d(\vec y, \vec y\,') = 2^{-\mathrm{val}(\vec y - \vec y\,')}$. Then, one has:


— Algebraic properness (or a-properness). A polynomial system is said to be
a-proper if it satisfies a Lipschitz condition


        $d(\Phi(\vec y), \Phi(\vec y\,')) < K\, d(\vec y, \vec y\,')$ for some $K < 1$.


In that case, the transformation $\Phi$ is a contraction on the complete metric space of
formal power series and, by the general fixed point theorem, the equation $\vec y = \Phi(\vec y)$
admits a unique solution. In passing, this solution may be obtained by the iterative
scheme,


        $\vec y^{\,(0)} = (0, \ldots, 0), \qquad \vec y^{\,(h+1)} = \Phi(\vec y^{\,(h)}), \qquad \vec y = \lim_{h \to \infty} \vec y^{\,(h)}.$
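
The iterative scheme lends itself to direct computation on truncated power series. The sketch below is an illustration under stated assumptions: it applies the scheme to the tree system $U = 1 + UV$, $V = zU^2$ of (58), with series truncated modulo $z^N$; the names `mul`, `phi` and the value N = 8 are hypothetical choices. For this particular system a single application of $\Phi$ need not gain a coefficient, but two successive applications always do, so 2N passes are used.

```python
from math import comb

N = 8  # series are truncated modulo z^N


def mul(a, b):
    """Product of two power series (coefficient lists), truncated modulo z^N."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N - i]):
            c[i + j] += ai * bj
    return c


def phi(U, V):
    """One application of Phi for the system U = 1 + U V, V = z U^2."""
    UV = mul(U, V)
    new_U = [1 + UV[0]] + UV[1:]
    new_V = [0] + mul(U, U)[: N - 1]
    return new_U, new_V


U, V = [0] * N, [0] * N          # y^(0) = (0, 0)
for _ in range(2 * N):           # two passes fix at least one more coefficient
    U, V = phi(U, V)

T = [0] + U[: N - 1]             # T = z U
# Since U = 1 + z U^3, its coefficients are the numbers binomial(3n, n) / (2n + 1).
closed_form = [comb(3 * n, n) // (2 * n + 1) for n in range(N)]

print(U)
print(T)
```

The coefficients of T obtained in this way are those of non-crossing trees, 1, 1, 3, 12, 55, and so on.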


The key notion is irreducibility. To a polynomial system, $\vec y = \Phi(\vec y)$, associate its
dependency graph defined as a graph whose vertices are the numbers 1, ..., m and
the edges ending at a vertex j are $k \to j$, if $y_j$ figures in a monomial of $\Phi_k$. (This
notion is reminiscent of the one already introduced for linear systems on page 329.)


— Algebraic irreducibility (or a-irreducibility). A polynomial system is said to 
be a-irreducible if its dependency graph is strongly connected. 
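
Strong connectivity of the dependency graph is easy to test mechanically. The following sketch assumes that the system is described by a dictionary `deps` mapping each index k to the set of indices j such that $y_j$ occurs in $\Phi_k$; this encoding, the function name `is_strongly_connected`, and the second (reducible) example system are assumptions made for illustration only.

```python
def is_strongly_connected(deps):
    """deps[k] is the set of j such that y_j occurs in Phi_k (edge k -> j)."""
    nodes = set(deps)

    def reachable(start, graph):
        seen, stack = {start}, [start]
        while stack:
            for nxt in graph[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # Reverse graph: edge j -> k whenever k -> j.
    reverse = {k: set() for k in nodes}
    for k, js in deps.items():
        for j in js:
            reverse[j].add(k)

    root = next(iter(nodes))
    # Strongly connected iff every vertex is reachable from the root
    # both in the graph and in its reverse.
    return reachable(root, deps) == nodes and reachable(root, reverse) == nodes


# Tree system (58): U = 1 + U V, V = z U^2.
tree_system = {"U": {"U", "V"}, "V": {"U"}}
# A hypothetical reducible system: y1 = z + y1^2, y2 = y1 + y2^2.
reducible_system = {1: {1}, 2: {1, 2}}

print(is_strongly_connected(tree_system))
print(is_strongly_connected(reducible_system))
```

In the second system, $y_1$ does not depend on $y_2$, so the graph is not strongly connected and the system falls outside the irreducible schema.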
Finally, one needs a technical notion of periodicity to dispose of cases like

        $y(z) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z^2}\right) = z + z^3 + 2z^5 + \cdots,$


(the OGF of complete binary trees) where coefficients are only nonzero for certain
residue classes of their index.


¹⁰ Let $f = \sum_{n \ge \beta} f_n z^n$ with $f_\beta \ne 0$; the valuation of f is by definition $\mathrm{val}(f) = \beta$.

— Algebraic aperiodicity (or a-aperiodicity). A power series is said to be aperiodic
if it contains three monomials (with nonzero coefficients), $z^{e_1}, z^{e_2}, z^{e_3}$,
such that $e_2 - e_1$ and $e_3 - e_1$ are relatively prime. A proper polynomial system
is said to be aperiodic if each of its component solutions $y_j$ is aperiodic.
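
The aperiodicity condition can be tested on the first few coefficients of a series. The sketch below is a hedged illustration: the function name `is_aperiodic` is an assumption, and, since only a truncation is inspected, a positive answer certifies aperiodicity whereas a negative answer merely says that no suitable triple occurs among the coefficients supplied.

```python
from itertools import combinations
from math import gcd


def is_aperiodic(coeffs):
    """Look for exponents e1 < e2 < e3 in the support with gcd(e2 - e1, e3 - e1) = 1."""
    support = [n for n, c in enumerate(coeffs) if c != 0]
    return any(
        gcd(e2 - e1, e3 - e1) == 1
        for e1, e2, e3 in combinations(support, 3)
    )


catalan = [1, 1, 2, 5, 14, 42]                    # 1 + z + 2z^2 + 5z^3 + ...
complete_binary = [0, 1, 0, 1, 0, 2, 0, 5]        # z + z^3 + 2z^5 + 5z^7 + ...

print(is_aperiodic(catalan))
print(is_aperiodic(complete_binary))
```

The gcd of the two differences does not depend on which of the three exponents is taken as $e_1$, so it suffices to examine increasing triples.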


Theorem VII.6 (Irreducible positive polynomial systems, DLW Theorem). Consider
a nonlinear polynomial system $\vec y = \Phi(\vec y)$ that is a-proper, a-positive, and a-irreducible.
In that case, all component solutions $y_j$ have the same radius of convergence $\rho < \infty$.
Then, there exist functions $h_j$ analytic at the origin such that, in a neighbourhood
of $\rho$:

(61)    $y_j = h_j\left(\sqrt{1 - z/\rho}\right).$


In addition, all other dominant singularities are of the form $\rho\omega$ with $\omega$ a root
of unity. If furthermore the system is a-aperiodic, all $y_j$ have $\rho$ as unique dominant
singularity. In that case, the coefficients admit a complete asymptotic expansion of the
form


(62)    $[z^n]\, y_j(z) \sim \rho^{-n} \left( \sum_{k \ge 0} d_k\, n^{-3/2-k} \right).$


PROOF. The proof consists in gathering by stages consequences of the assumptions. 
It is essentially based on a close examination of “failure” of the multivariate implicit
function theorem and the way this leads to square-root singularities. 


(a) As a preliminary observation, we note that each component solution $y_j$ is
an algebraic function that has a nonzero radius of convergence. (This can be checked
directly by the method of majorant series or as a consequence of a multivariate version
of the implicit function theorem.)


(b) Properness together with the positivity of the system implies that each $y_j(z)$
has nonnegative coefficients in its expansion at 0, since it is a formal limit of approximants
that have nonnegative coefficients. In particular, each power series $y_j$ has a
certain nonzero radius of convergence $\rho_j$. Also, by positivity, $\rho_j$ is a singularity of $y_j$
(by virtue of Pringsheim’s theorem). From the known nature of singularities of algebraic
functions (e.g., the Newton-Puiseux Theorem, recalled p. 474 below), there
must exist some order $R > 0$ such that each Rth derivative $\partial_z^R y_j(z)$ becomes infinite
as $z \to \rho_j$.

We establish now that $\rho_1 = \cdots = \rho_m$. In effect, differentiation of the equations
composing the system implies that a derivative of arbitrary order r, $\partial_z^r y_j(z)$, is a
linear form in other derivatives $\partial_z^r y_j(z)$ of the same order (and a polynomial form in
lower order derivatives); also the linear combination and the polynomial form have
nonnegative coefficients. Assume a contrario that the radii were not all equal, say
$\rho_1 = \cdots = \rho_s$, with the other radii $\rho_{s+1}, \ldots$ being strictly greater. Consider the
system differentiated a sufficiently large number of times, R. Then, as $z \to \rho_1$, we
must have $\partial_z^R y_j$ tending to infinity for $j \le s$. On the other hand, the quantities $y_{s+1}$,
etc., being analytic, their Rth derivatives that are analytic as well must tend to finite
limits. In other words, because of the irreducibility assumption (and again positivity),
infinity has to propagate and we have reached a contradiction. Thus: all the $y_j$ have
the same radius of convergence. We let $\rho$ denote this common value.


(c1) The key step consists in establishing the existence of a square-root singularity
at the common singularity $\rho$. Consider first the scalar case, that is

(63)    $y - \phi(z, y) = 0,$

where $\phi$ is assumed to be a nonlinear polynomial in y and to have nonnegative coefficients.
This case resorts to the smooth implicit function schema, whose argument we
briefly revisit under our present perspective.

Let y(z) be the unique branch of the algebraic function that is analytic at 0. Comparison
of the asymptotic orders in y inside the equality $y = \phi(z, y)$ shows that (by
nonlinearity) we cannot have $y \to \infty$ when z tends to a finite limit. Let now $\rho$ be the
radius of convergence of y(z). Since y(z) is necessarily finite at its singularity $\rho$, we
set $\tau = y(\rho)$ and note that, by continuity, $\tau - \phi(\rho, \tau) = 0$.

By the implicit function theorem, a solution $(z_0, y_0)$ of (63) can be continued
analytically as $(z, y_0(z))$ in the vicinity of $z_0$ as long as the derivative with respect
to y (a simplified Jacobian),


        $J(z_0, y_0) := 1 - \phi'_y(z_0, y_0),$


remains nonzero. The quantity $\rho$ being a singularity, we must thus have $J(\rho, \tau) = 0$.
On the other hand, the second derivative $-\phi''_{yy}$ is nonzero at $(\rho, \tau)$ (by nonlinearity
and positivity). Then, the local expansion of the defining equation (63) at $(\rho, \tau)$ binds
(z, y) locally by


        $-(z - \rho)\,\phi'_z(\rho, \tau) - \frac{1}{2}(y - \tau)^2\,\phi''_{yy}(\rho, \tau) + \cdots = 0,$

implying the singular expansion


        $y - \tau = -\gamma\,(1 - z/\rho)^{1/2} + \cdots.$

This establishes the first part of the assertion in the scalar case.
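
Solving the quadratic relation between $(z - \rho)$ and $(y - \tau)^2$ in the local expansion gives $\gamma = \sqrt{2\rho\,\phi'_z(\rho,\tau)/\phi''_{yy}(\rho,\tau)}$. As a sanity check, the sketch below evaluates this constant for the Catalan equation $y = 1 + zy^2$ (so that $\phi'_z = y^2$ and $\phi''_{yy} = 2z$) and compares the leading term $\frac{\gamma}{2\sqrt{\pi n^3}}\rho^{-n}$ with an exact Catalan number. The function names and the choice n = 200 are assumptions of this illustration.

```python
from math import comb, pi, sqrt

rho, tau = 0.25, 2.0
phi_z = tau**2            # d/dz of 1 + z y^2, evaluated at (rho, tau)
phi_yy = 2 * rho          # second derivative in y of 1 + z y^2, at (rho, tau)
gamma = sqrt(2 * rho * phi_z / phi_yy)


def catalan(n):
    """Exact Catalan number binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def estimate(n):
    """Leading term gamma / (2 sqrt(pi n^3)) * rho^(-n) of the asymptotic form."""
    return gamma / (2 * sqrt(pi * n**3)) * rho ** (-n)


n = 200
relative_error = abs(estimate(n) / catalan(n) - 1)
print(gamma, relative_error)
```

The relative error is of order 1/n, in line with the error term of a square-root singularity.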


(c2) In the multivariate case, we graft an ingenious argument [323] that is based
on a linearized version of the system to which Perron-Frobenius theory is applicable.
First, irreducibility implies that any component solution $y_j$ depends positively and
nonlinearly on itself (by possibly iterating $\Phi$), so that a contradiction in asymptotic
behaviour would result, if we suppose that any $y_j$ tends to infinity. Each $y_j(z)$
therefore remains finite at the positive dominant singularity $\rho$.

Now, the multivariate version of the implicit function theorem grants us locally
the analytic continuation of any solution $y_1, y_2, \ldots, y_m$ at $z_0$ provided there is no
vanishing of the Jacobian determinant


        $J(z_0, y_1, \ldots, y_m) := \det\left( \delta_{i,j} - \frac{\partial}{\partial y_j} \Phi_i(z_0, y_1, \ldots, y_m) \right),$


where $\delta_{i,j}$ is Kronecker’s symbol. Thus, we must have


(64)    $J(\rho, \tau_1, \ldots, \tau_m) = 0 \qquad \text{where} \quad \tau_j := y_j(\rho).$

The next argument (we follow Lalley [323]) uses Perron-Frobenius theory and
linear algebra. Consider the Jacobian matrix


        $K(z_0, y_1, \ldots, y_m) := \left( \frac{\partial}{\partial y_j} \Phi_i(z_0, y_1, \ldots, y_m) \right)_{i,j},$


which represents the “linear part” of $\Phi$. For $z, y_1, \ldots, y_m$ all nonnegative, the matrix
K has positive entries (by positivity of $\Phi$) so that it is amenable to Perron-Frobenius
theory. In particular it has a positive eigenvalue $\lambda(z, y_1, \ldots, y_m)$ that dominates all
the others in modulus. The quantity


        $\hat\lambda(z) = \lambda(z, y_1(z), \ldots, y_m(z))$

is increasing as it is an increasing function of the matrix entries that themselves
increase with z for $z \ge 0$.

We propose to prove that $\hat\lambda(\rho) = 1$. In effect, $\hat\lambda(\rho) < 1$ is excluded since otherwise
$(I - K)$ would be invertible at $z = \rho$ and this would imply $J \ne 0$, thereby contradicting
the singular character of the $y_j(z)$ at $\rho$. Assume a contrario $\hat\lambda(\rho) > 1$ in order
to exclude the other case. Then, by the increasing property, there would exist $\rho_1 < \rho$
such that $\hat\lambda(\rho_1) = 1$. Let $v_1$ be a left eigenvector of $K(\rho_1, y_1(\rho_1), \ldots, y_m(\rho_1))$
corresponding to the eigenvalue $\hat\lambda(\rho_1)$. Perron-Frobenius theory guarantees that such a
vector $v_1$ has all its coefficients that are positive. Then, upon multiplying on the left
by $v_1$ the column vectors corresponding to $\vec y$ and $\Phi(\vec y)$ (which are equal), one gets an
identity; this derived identity upon expanding near $\rho_1$ gives


(65)    $A(z - \rho_1) = -\sum_{i,j} B_{i,j}\,(y_i(z) - y_i(\rho_1))(y_j(z) - y_j(\rho_1)) + \cdots,$


where $\cdots$ hides lower order terms and the coefficients $A, B_{i,j}$ are nonnegative with
$A > 0$. There is a contradiction in the orders of growth if each $y_i$ is assumed to be
analytic at $\rho_1$ since the left side of (65) is of exact order $(z - \rho_1)$ while the right side
is at least as small as $(z - \rho_1)^2$. Thus, we must have $\hat\lambda(\rho) = 1$ and $\hat\lambda(x) < 1$ for
$x \in (0, \rho)$.

A calculation similar to (65) but with $\rho_1$ replaced by $\rho$ shows finally that, if

        $y_i(z) - y_i(\rho) \sim \gamma_i\,(\rho - z)^{\alpha},$

then consistency of asymptotic expansions implies $2\alpha = 1$, that is $\alpha = \frac{1}{2}$. We have
thus proved: All the component solutions $y_j(z)$ have a square-root singularity at $\rho$.
(The existence of a complete expansion in powers of $(\rho - z)^{1/2}$ results from a refinement
of this argument.) The proof of the general case (61) is thus complete.
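
The eigenvalue condition can be observed numerically on the tree system $U = 1 + UV$, $V = zU^2$ of (58), whose matrix of partial derivatives with respect to (U, V) has rows $(V, U)$ and $(2zU, 0)$. The sketch below is an illustration only: it uses a plain power iteration (adequate here because the second eigenvalue is smaller in modulus), and the function names, the test point z = 0.1 and the iteration counts are assumptions of this example. The values $\rho = 4/27$, $U(\rho) = 3/2$, $V(\rho) = 1/3$ are those found in Example VII.15.

```python
def dominant_eigenvalue(K, steps=300):
    """Power iteration for a nonnegative matrix with a unique dominant eigenvalue."""
    v = [1.0] * len(K)
    lam = 0.0
    for _ in range(steps):
        w = [sum(kij * vj for kij, vj in zip(row, v)) for row in K]
        lam = max(w)                     # normalise by the largest component
        v = [x / lam for x in w]
    return lam


def jacobian_matrix(z, U, V):
    """Partial derivatives of (1 + U V, z U^2) with respect to (U, V)."""
    return [[V, U], [2 * z * U, 0.0]]


def solve_numerically(z, steps=2000):
    """Values U(z), V(z) for 0 <= z < rho, by iterating the system from (0, 0)."""
    U = V = 0.0
    for _ in range(steps):
        U, V = 1 + U * V, z * U * U
    return U, V


rho, ups, nu = 4 / 27, 3 / 2, 1 / 3
lam_at_rho = dominant_eigenvalue(jacobian_matrix(rho, ups, nu))
lam_inside = dominant_eigenvalue(jacobian_matrix(0.1, *solve_numerically(0.1)))
print(lam_at_rho, lam_inside)
```

At the singularity the dominant eigenvalue equals 1, while at an interior point of $(0, \rho)$ it is strictly smaller, as the argument requires.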

(d) In the aperiodic case, we first observe that each $y_j(z)$ cannot assume an
infinite value on its circle of convergence $|z| = \rho$, since this would contradict the
boundedness of $|y_j(z)|$ in the open disk $|z| < \rho$ (where $y_j(\rho)$ serves as an upper
bound). Consequently, by singularity analysis, the Taylor coefficients of any $y_j(z)$ are
$O(n^{-1-\eta})$ for some $\eta > 0$ and the series representing $y_j$ at the origin converges on
$|z| = \rho$.

For the rest of the argument, we observe that if $\vec y = \Phi(z, \vec y)$, then $\vec y = \Phi^{(m)}(z, \vec y)$
where the superscript denotes iteration of the transformation $\Phi$ in the variables
$\vec y = (y_1, \ldots, y_m)$. By irreducibility, $\Phi^{(m)}$ is such that each of its component polynomials
involves all the variables.

Assume that there would exist a singularity $\rho^*$ of some $y_j(z)$ on $|z| = \rho$. The
triangle inequality yields $|y_j(\rho^*)| < y_j(\rho)$ where strictness is related to the general
aperiodicity argument encountered at several other places in this book. But then, the
modified Jacobian matrix $K^{(m)}$ of $\Phi^{(m)}$ taken at the $y_j(\rho^*)$ has entries dominated
strictly by the entries of $K^{(m)}$ taken at the $y_j(\rho)$. There results that the dominant
eigenvalue of $K^{(m)}(z, y_j(\rho^*))$ must be strictly less than 1. But this would imply that
$I - K^{(m)}(z, y_j(\rho^*))$ is invertible so that the $y_j(z)$ would be analytic at $\rho^*$. A
contradiction has been reached: $\rho$ is the sole dominant singularity of each $y_j$ and this
concludes the argument.

Many extensions of the DLW Theorem are possible, as indicated by the notes and
references below; the underlying arguments are powerful, versatile, and highly general.
Consequences regarding limit distributions, as obtained by Drmota and Lalley,
are further explored in Chapter IX.

▷ VII.27. Analytic systems. Drmota [135] has shown that the conclusions of the DLW Theorem
regarding universality of the square-root singularity hold more generally for $\Phi_j$ that are
analytic functions of $\mathbb{C}^{m+1}$ to $\mathbb{C}$, provided there exists a positive solution of the characteristic
system within the domain of analyticity of the $\Phi_j$. This extension then unifies the DLW theorem
and Theorem VII.3 relative to the smooth implicit function schema. ◁


▷ VII.28. Pólya systems. Woods [503] has shown that Pólya operators of the form $\mathrm{MSET}_j$
can also be treated by an extension of the DLW Theorem, which then unifies this theorem and
Theorem VII.4. ◁


▷ VII.29. Infinite systems. Lalley [325] has extended the conclusions of the DLW Theorem to
certain infinite systems of generating function equations. This makes it possible to quantify the
return probabilities of certain random walks on infinite free products of finite groups. ◁


The square-root singularity property ceases to be universal when the assumptions
of Theorems VII.5 and VII.6, namely, positivity and irreducibility, fail to be satisfied.
For instance, supertrees that are specified by a positive but reducible system have a 
singularity of the fourth-root type (Example 10, p. 394 and Example 18, p. 479). We 
discuss in the next section, §VII. 7, general methods that apply to any algebraic func- 
tion and are based on the minimal polynomial equation (rather than system) satisfied 
by the function. Note that the results there do not always subsume the present ones, 
since structure is not preserved when a system is reduced, by elimination, to a single 
equation. (It would be desirable to determine directly from the system itself the type
of singular behaviour of the solution, but the systematic research involved in such a
programme has not yet been carried out.)


VII. 7. The general analysis of algebraic functions 


Algebraic series and algebraic functions are simply defined as solutions of a polynomial
equation or system. Their singularities are strongly constrained to be branch
points, with the local expansion at a singularity being a fractional power series known
as a Newton-Puiseux expansion; see Subsection VII.7.1. Singularity analysis then
appears to be systematically applicable to algebraic functions, to the effect that their
coefficients are asymptotically composed of elements of the form

(66)    $C \cdot \omega^n n^{p/q}, \qquad p/q \in \mathbb{Q} \setminus \{-1, -2, -3, \ldots\};$

cf. Subsection VII. 7.2. This last form includes as a special case the exponent $p/q = -\frac{3}{2}$,
that was encountered repeatedly, when dealing with inverse functions, implicit functions,
and irreducible systems. In this section, we develop the basic structural results
that lead to the asymptotic forms (66). Coming up however with effective solutions
(i.e., decision procedures) is not obvious in the algebraic case, and a number of algorithms
are described in order to locate and analyse singularities (Newton’s polygon
method). In particular, the multivalued character of algebraic functions creates a need
to solve connection problems.


Basics. We adopt as the starting point of the present discussion the following 
definition of an algebraic function or series (see also Note 30 for a variant). 


Definition VII.6. A function f(z) analytic in a neighbourhood V of a point $z_0$ is said
to be algebraic if there exists a (nonzero) polynomial $P(z, y) \in \mathbb{C}[z, y]$, such that


(67)    $P(z, f(z)) = 0, \qquad z \in V.$


A power series $f \in \mathbb{C}[[z]]$ is said to be an algebraic power series if it coincides with
the expansion of an algebraic function at 0.


The degree of an algebraic series or function f is by definition the minimal value
of $\deg_y P(z, y)$ over all polynomials that are cancelled by f (so that rational series
are algebraic of degree 1). One can always assume P to be irreducible over $\mathbb{C}$ (that is
P = QR implies that one of Q or R is a scalar) and of minimal degree.

An algebraic function may also be defined by starting with a polynomial system
of the form

(68)    $P_1(z, y_1, \ldots, y_m) = 0, \quad \ldots, \quad P_m(z, y_1, \ldots, y_m) = 0,$

where each $P_j$ is a polynomial. A solution of the system (68) is by definition an m-tuple
$(f_1, \ldots, f_m)$ that cancels each $P_j$, that is, $P_j(z, f_1, \ldots, f_m) = 0$. Any of the
$f_j$ is called a component solution. A basic but nontrivial result of elimination theory
is that any component solution of a nondegenerate polynomial system is an algebraic
series (APPENDIX B: Algebraic elimination, p. 685). In other words, one can eliminate
the auxiliary variables $y_2, \ldots, y_m$ and construct a single bivariate polynomial Q
such that $Q(z, y_1) = 0$.

We stress the point that, in the definitions by equation (67) or system (68), no
positivity of any sort nor irreducibility is assumed. The foregoing treatment applies to
any algebraic function, whether or not it comes from combinatorics.


▷ VII.30. Algebraic definition of algebraic series. It is also customary to define f to be an
algebraic series if it satisfies P(z, f) = 0 in the sense of formal power series, without a priori
consideration of convergence issues. Then the technique of majorant series may be used to
prove that the coefficients of f grow at most exponentially. Thus, the alternative definition is
indeed equivalent to Definition VII.6. ◁

▷ VII.31. “Alg is in Diag of Rat”. Every algebraic function F(z) over $\mathbb{C}(z)$ is the diagonal of
a rational function $G(x, y) = A(x, y)/B(x, y) \in \mathbb{C}(x, y)$. Precisely:


        $F(z) = \sum_{n \ge 0} G_{n,n} z^n, \qquad \text{where} \quad G(x, y) = \sum_{m,n \ge 0} G_{m,n} x^m y^n.$


This is granted by a theorem of Denef and Lipshitz [120], which is related to the holonomic
framework (APPENDIX B: Holonomic functions, p. 693). ◁


▷ VII.32. Multinomial sums and algebraic coefficients. Let F(z) be an algebraic function.
Then $F_n = [z^n] F(z)$ is a (finite) linear combination of “multinomial forms” defined as


        $S_n(\mathcal{C}; h; c_1, \ldots, c_r) := \sum_{\mathcal{C}} \binom{n_0 + h}{n_1, \ldots, n_r} c_1^{n_1} \cdots c_r^{n_r},$


where the summation is over all values of $n_0, n_1, \ldots, n_r$ satisfying a collection of linear
inequalities $\mathcal{C}$ involving n. [Hint: a consequence of Denef–Lipshitz.] ◁


VII. 7.1. Singularities of general algebraic functions. Let P(z, y) be an irre- 
ducible polynomial of C[z, y], 


        $P(z, y) = p_0(z)\,y^d + p_1(z)\,y^{d-1} + \cdots + p_d(z).$

The solutions of the polynomial equation P(z, y) = 0 define a locus of points (z, y) 
in C x C that is known as a complex algebraic curve. Let d be the y-degree of P. 
Then, for each z there are at most d possible values of y. In fact, there exist d values 
of y “almost always”, that is except for a finite number of cases: 

— If $z_0$ is such that $p_0(z_0) = 0$, then there is a reduction in the degree in y and
hence a reduction in the number of finite y-solutions for the particular value
of $z = z_0$. One can conveniently regard the points that disappear as “points
at infinity” (formally, one then operates in the projective plane).

— If $z_0$ is such that $P(z_0, y)$ has a multiple root, then some of the values of y
will coalesce.


Define the exceptional set of P as the set ($\mathbf{R}$ is the resultant of APPENDIX B: Algebraic
elimination, p. 685):


(69)    $\Xi[P] := \{ z \mid R(z) = 0 \}, \qquad R(z) := \mathbf{R}(P(z, y), \partial_y P(z, y), y).$


The quantity R(z) is also known as the discriminant of P(z, y), with z taken as a
parameter. If $z \notin \Xi[P]$, then we have a guarantee that there exist d distinct solutions
to P(z, y) = 0, since $p_0(z) \ne 0$ and $\partial_y P(z, y) \ne 0$. Then, by the Implicit Function
Theorem, each of the solutions $y_j$ lifts into a locally analytic function $y_j(z)$. A branch
of the algebraic curve P(z, y) = 0 is the choice of such a $y_j(z)$ together with a
simply connected region of the complex plane throughout which this particular $y_j(z)$
is analytic.

Singularities of an algebraic function can thus only occur if z lies in the exceptional
set $\Xi[P]$. At a point $z_0$ such that $p_0(z_0) = 0$, some of the branches escape to
infinity, thereby ceasing to be analytic. At a point $z_0$ where the resultant polynomial
R(z) vanishes but $p_0(z) \ne 0$, two or more branches collide. This can be either
a multiple point (two or more branches happen to assume the same value, but each
one exists as an analytic function around $z_0$) or a branch point (some of the branches
actually cease to be analytic). An example of an exceptional point that is not a branch
point is provided by the classical lemniscate of Bernoulli: at the origin, two branches
meet while each one is analytic there (see Figure 14).

FIGURE VII.14. The real section of the lemniscate of Bernoulli defined by $P(z, y) =
(z^2 + y^2)^2 - (z^2 - y^2) = 0$: the origin is a double point where two analytic branches
meet.

A partial knowledge of the topology of a complex algebraic curve may be gotten 
by first looking at its restriction to the reals. Consider for instance the polynomial 
equation P(z, y) = 0, where 


        $P(z, y) = y - 1 - zy^2,$


which defines the OGF of the Catalan numbers. A rendering of the real part of the
curve is given in Figure 15. The complex aspect of the curve as given by $\Im(y)$ as a
function of z is also displayed there. In accordance with earlier observations, there are
normally two sheets (branches) above each point. The exceptional set is given by the
roots of the discriminant,


        $R = z(1 - 4z),$


that is, $z = 0, \frac{1}{4}$. For z = 0, one of the branches escapes at infinity, while for z = 1/4,
the two branches meet and there is a branch point; see Figure 15.
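
The discriminant $R = z(1 - 4z)$ can be recovered as the determinant of the Sylvester matrix of P and $\partial_y P$, both viewed as polynomials in y with coefficients in z. The sketch below carries this out with elementary list-based polynomial arithmetic; it is an illustration under stated assumptions (the helper names `pmul`, `padd`, `pneg`, `trim`, `det3` are hypothetical, and a polynomial in z is encoded as the list of its coefficients by increasing degree).

```python
def pmul(a, b):
    """Product of two polynomials in z given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def padd(a, b):
    """Sum of two polynomials in z."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]


def pneg(a):
    return [-x for x in a]


def trim(a):
    """Drop trailing zero coefficients."""
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a


def det3(m):
    """Cofactor expansion of a 3x3 matrix whose entries are polynomials in z."""
    (a, b, c), (d, e, f), (g, h, i) = m
    t1 = pmul(a, padd(pmul(e, i), pneg(pmul(f, h))))
    t2 = pmul(b, padd(pmul(d, i), pneg(pmul(f, g))))
    t3 = pmul(c, padd(pmul(d, h), pneg(pmul(e, g))))
    return trim(padd(padd(t1, pneg(t2)), t3))


# P(z, y) = -z y^2 + y - 1 and its y-derivative -2z y + 1.
p0, p1, p2 = [0, -1], [1], [-1]
q0, q1 = [0, -2], [1]
zero = [0]

sylvester = [
    [p0, p1, p2],
    [q0, q1, zero],
    [zero, q0, q1],
]
R = det3(sylvester)
print(R)   # coefficients of R(z) by increasing degree
```

The resulting list encodes $z - 4z^2 = z(1 - 4z)$, whose roots 0 and 1/4 form the exceptional set. For polynomials of higher degree one would rely on the resultant routine of a computer algebra system rather than on a hand-written determinant.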

In summary the exceptional set provides a set of possible candidates for the sin- 
gularities of an algebraic function. 


Lemma VII.4 (Location of algebraic singularities). Let y(z), analytic at the origin, 
satisfy a polynomial equation P(z,y) = 0. Then, y(z) can be analytically continued 
along any simple path emanating from the origin that does not cross any point of the 
exceptional set defined in (69). 


PROOF. At any $z_0$ that is not exceptional and for a $y_0$ satisfying $P(z_0, y_0) = 0$,
the fact that the discriminant is nonzero implies that $P(z_0, y)$ has only a simple root
at $y_0$. Consequently, we have $P'_y(z_0, y_0) \ne 0$. By the Implicit Function Theorem, the
algebraic function y(z) is analytic in a neighbourhood of $z_0$.

Nature of singularities. We start the discussion with an exceptional point that
is placed at the origin (by a translation $z \mapsto z + z_0$) and assume that the equation
P(0, y) = 0 has k equal roots $y_1, \ldots, y_k$ where y = 0 is this common value (by a
translation $y \mapsto y + y_1$ or an inversion $y \mapsto 1/y$, if points at infinity are considered).
Consider a punctured disk $|z| < r$ that does not include any other exceptional
point relative to P. In the argument that follows, we let $y_1(z), \ldots, y_k(z)$ be analytic
determinations of the root that tend to 0 as $z \to 0$.

FIGURE VII.15. The real section of the Catalan curve (top). The complex Catalan curve
with a plot of $\Im(y)$ as a function of $z = (\Re(z), \Im(z))$ (bottom left); a blowup of $\Im(y)$
near the branch point at $z = \frac{1}{4}$ (bottom right).

Start at some arbitrary value interior to the real interval $(0, r)$, where the quantity
$y_1(z)$ is locally an analytic function of $z$. By the implicit function theorem, $y_1(z)$ can
be continued analytically along a circuit that starts from $z$ and returns to $z$ while simply
encircling the origin (and staying within the punctured disk). Then, by permanence of
analytic relations, $y_1(z)$ will be taken into another root, say, $y_1^{(1)}(z)$. By repeating the
process, we see that after a certain number of times $\kappa$ with $1 \le \kappa \le k$, we will have
obtained a collection of roots $y_1(z) = y_1^{(0)}(z), \ldots, y_1^{(\kappa)}(z) = y_1(z)$ that form a set of
$\kappa$ distinct values. Such roots are said to form a cycle. In this case, $y_1(t^\kappa)$ is an analytic
function of $t$ except possibly at 0 where it is continuous and has value 0. Thus, by
general principles (regarding removable singularities), it is in fact analytic at 0. This
in turn implies the existence of a convergent expansion near 0:

(70)    $y_1(t^\kappa) = \sum_{n=1}^{\infty} c_n t^n.$


(The parameter $t$ is known as the local uniformizing parameter, as it reduces a
multivalued function to a single-valued one.) This translates back into the world of $z$: each
determination of $z^{1/\kappa}$ yields one of the branches of the multivalued analytic function
as

(71)    $y_1(z) = \sum_{n=1}^{\infty} c_n z^{n/\kappa}.$

Alternatively, with $\omega = e^{2i\pi/\kappa}$ a root of unity, the $\kappa$ determinations are obtained as

        $y_1^{(j)}(z) = \sum_{n=1}^{\infty} c_n \omega^{jn} z^{n/\kappa},$

each being valid in a sector of opening $< 2\pi$. (The case $\kappa = 1$ corresponds to an
analytic branch.)
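The monodromy argument can be observed numerically. The following Python sketch is illustrative only (the helper names `roots` and `follow` are introduced here and do not come from the text). It tracks a root of the Catalan equation $y - 1 - zy^2 = 0$ by continuity along a small circle around the branch point $z = 1/4$. One turn exchanges the two roots and two turns restore the starting value, which is what a cycle of length $\kappa = 2$ means.

```python
import cmath

def roots(z):
    """Both roots of y - 1 - z*y**2 = 0 (valid for z != 0)."""
    s = cmath.sqrt(1 - 4 * z)
    return ((1 - s) / (2 * z), (1 + s) / (2 * z))

def follow(y, turns, steps=2000, radius=0.1):
    """Continue the root y along `turns` circuits of the circle |z - 1/4| = radius."""
    for k in range(1, turns * steps + 1):
        z = 0.25 + radius * cmath.exp(2j * cmath.pi * k / steps)
        # Continuity: keep the root closest to the current value.
        y = min(roots(z), key=lambda c: abs(c - y))
    return y

start = roots(0.35)               # the two roots at z = 1/4 + radius
after_one = follow(start[0], 1)   # lands on the other root
after_two = follow(start[0], 2)   # back to the starting root
print(start, after_one, after_two)
```

The nearest-root selection is a crude stand-in for analytic continuation; it is adequate here because the two roots stay well separated along the chosen circle.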

If $\kappa = k$, then the cycle accounts for all the roots which tend to 0. Otherwise,
we repeat the process with another root and, in this fashion, eventually exhaust all
roots. Thus, all the $k$ roots that have value 0 at $z = 0$ are grouped into cycles of size
$\kappa_1, \ldots, \kappa_\ell$. Finally, values of $y$ at infinity are brought to zero by means of the change
of variables $y = 1/u$, then leading to negative exponents in the expansion of $y$.


Theorem VII.7 (Newton-Puiseux expansions at a singularity). Let $f(z)$ be a branch
of an algebraic function $P(z, f(z)) = 0$. In a circular neighbourhood of a singularity
$\zeta$ slit along a ray emanating from $\zeta$, $f(z)$ admits a fractional series expansion
(Puiseux expansion) that is locally convergent and of the form

        $f(z) = \sum_{k \ge k_0} c_k (z - \zeta)^{k/\kappa},$

for a fixed determination of $(z - \zeta)^{1/\kappa}$, where $k_0 \in \mathbb{Z}$ and $\kappa$ is an integer $\ge 2$, called
the “branching type”.


Newton (1643-1727) discovered the algebraic form of Theorem VII.7 and published
it in his famous treatise De Methodis Serierum et Fluxionum (completed in 1671). This
method was subsequently developed by Victor Puiseux (1820-1883) so that the name 
of Puiseux series is customarily attached to fractional series expansions. The argument 
given above is taken from the neat exposition offered by Hille in [269, Ch. 12, vol. II]. 
It is known as a “monodromy argument”, meaning that it consists in following the 
course of values of an analytic function along paths in the complex plane till it returns 
to its original value. 

Newton polygon. Newton also described a constructive approach to the determination
of branching types near a point $(z_0, y_0)$, that by means of the previous discussion
can always be taken to be $(0, 0)$. In order to introduce the discussion, let us
examine the Catalan generating function near $z_0 = 1/4$. Elementary algebra gives the
explicit form of the two branches

        $y_1(z) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z}\right), \qquad y_2(z) = \frac{1}{2z}\left(1 + \sqrt{1 - 4z}\right),$

whose forms are consistent with what Theorem VII.7 predicts. If however one starts
directly with the equation,

        $P(z, y) = y - 1 - zy^2 = 0,$

then the translation $z = 1/4 - Z$ (the minus sign is a mere notational convenience),
$y = 2 + Y$ yields

(72)    $Q(Z, Y) = -\frac{1}{4}Y^2 + 4Z + 4ZY + ZY^2.$


Look for solutions of the form $Y = cZ^{\alpha}(1 + o(1))$ with $c \neq 0$, whose existence is a priori
granted by the Newton-Puiseux Theorem. Each of the monomials in (72) gives rise
to a term of a well determined asymptotic order, respectively $Z^{2\alpha}$, $Z^{1}$, $Z^{\alpha+1}$, $Z^{2\alpha+1}$.
If the equation is to be identically satisfied, then the main asymptotic order of $Q(Z, Y)$
should be 0. Since $c \neq 0$, this can only happen if two or more of the exponents in
the sequence $(2\alpha, 1, \alpha + 1, 2\alpha + 1)$ coincide and the coefficient of the corresponding
monomial in $Q(Z, Y)$ is zero, a condition that is an algebraic constraint on the
constant $c$. Furthermore, exponents of all the remaining monomials have to be larger
since by assumption they represent terms of lower asymptotic order.

Examination of all the possible combinations of exponents leads one to discover
that the only possible combination arises from the cancellation of the first two terms
of $Q$, namely $-\frac14 Y^2 + 4Z$, which corresponds to the set of constraints

        $2\alpha = 1, \qquad -\frac{1}{4}c^2 + 4 = 0,$

with the supplementary conditions $\alpha + 1 > 1$ and $2\alpha + 1 > 1$ being satisfied by this
choice $\alpha = \frac12$. We have thus discovered that $Q(Z, Y) = 0$ is consistent asymptotically
with

        $Y \sim 4Z^{1/2}, \qquad Y \sim -4Z^{1/2}.$

The process can be iterated upon subtracting dominant terms. It invariably gives
rise to complete formal asymptotic expansions that satisfy $Q(Z, Y) = 0$ (in the Catalan
example, these are series in $\pm Z^{1/2}$). Furthermore, elementary majorizations establish
that such formal asymptotic solutions represent indeed convergent series. Thus,
local expansions of branches have indeed been determined.

An algorithmic refinement (also due to Newton) can be superimposed on the previous
discussion and is known as the method of Newton polygons. Consider a general
polynomial

        $Q(Z, Y) = \sum_{j \in J} Z^{a_j} Y^{b_j},$

and associate to it the finite set of points $(a_j, b_j)$ in $\mathbb{N} \times \mathbb{N}$, which is called the Newton
diagram. It is easily verified that the only asymptotic solutions of the form $Y \propto Z^{\tau}$
correspond to values of $\tau$ that are inverse slopes (i.e., $\Delta x/\Delta y$) of lines connecting
two or more points of the Newton diagram (this expresses the cancellation condition
between two monomials of $Q$) and such that all other points of the diagram are on this
line or to the right of it. In other words:


Newton’s polygon method. Any possible exponent $\tau$ such that $Y \sim cZ^{\tau}$ is
a solution to a polynomial equation corresponds to one of the inverse slopes
of the leftmost convex envelope of the Newton diagram. For each viable $\tau$,
a polynomial equation constrains the possible values of the corresponding
coefficient $c$. Complete expansions are obtained by repeating the process,
which means deflating $Y$ from its main term by way of the substitution
$Y \mapsto Y - cZ^{\tau}$.

FIGURE VII.16. The real algebraic curve defined by the equation
$P = (y - z^2)(y^2 - z)(y^2 - z^3) - z^3y^3$ near $(0, 0)$ (left) and the corresponding Newton
diagram (right).
Figure 16 illustrates what goes on in the case of the curve $P = 0$ where

        $P(z, y) = (y - z^2)(y^2 - z)(y^2 - z^3) - z^3y^3$
        $\phantom{P(z, y)} = y^5 - y^3z - y^4z^2 + y^2z^3 - 2z^3y^3 + z^4y + z^5y^2 - z^6,$

considered near the origin. As the partly factored form suggests, we expect the curve
to resemble the union of two orthogonal parabolas and of a curve $y = \pm z^{3/2}$ having a
cusp, i.e., the union of

        $y = z^2, \qquad y = \pm\sqrt{z}, \qquad y = \pm z^{3/2},$

respectively. It is visible on the Newton diagram of the expanded form that the possible
exponents $y \propto z^{\tau}$ at the origin are the inverse slopes of the segments composing the
envelope, that is,

        $\tau = 2, \qquad \tau = \frac12, \qquad \tau = \frac32.$


For computational purposes, once the branching type $\kappa$, the value of $k_0$ that
dictates where the expansion starts, and the first coefficient have been determined, the
full expansion can be recovered by deflating the function from its first term and repeating
the Newton diagram construction. In fact, after a few initial stages of iteration, the method
of indeterminate coefficients can always be eventually applied^11. Computer algebra
systems usually have this routine included as one of the standard packages; see [427].
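The Newton diagram rule can be sketched in a few lines of Python. This is a minimal illustration, not the routine of any computer algebra system: the function name `newton_exponents` is hypothetical, and a brute-force test over pairs of points replaces a proper convex hull computation. A point $(a, b)$ stands for the monomial $Z^a Y^b$; an exponent $\tau$ is retained when the order $a + \tau b$ is minimal over the diagram and attained at least twice.

```python
from fractions import Fraction
from itertools import combinations

def newton_exponents(points):
    """Positive exponents tau for which the minimum of a + tau*b over the
    diagram is reached by at least two points (a, b) = (deg in Z, deg in Y)."""
    taus = set()
    for (a1, b1), (a2, b2) in combinations(points, 2):
        if b1 == b2:
            continue  # same power of Y: the two orders never coincide
        tau = Fraction(a1 - a2, b2 - b1)
        if tau <= 0:
            continue
        order = a1 + tau * b1
        # Every other monomial must be of equal or higher order.
        if all(a + tau * b >= order for a, b in points):
            taus.add(tau)
    return sorted(taus)

# Equation (72): -Y^2/4 + 4Z + 4ZY + ZY^2
catalan = [(0, 2), (1, 0), (1, 1), (1, 2)]
# Expanded form of the curve of Figure VII.16, with (z, y) in the roles of (Z, Y)
curve = [(0, 5), (1, 3), (2, 4), (3, 2), (3, 3), (4, 1), (5, 2), (6, 0)]

print(newton_exponents(catalan))
print(newton_exponents(curve))
```

For the Catalan diagram the only exponent found is 1/2, and for the curve of Figure VII.16 the exponents are 1/2, 3/2 and 2, in agreement with the partly factored form.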


11 Bruno Salvy, private communication, August 2000.

VII. 7.2. Asymptotic form of coefficients. The Newton-Puiseux theorem describes
precisely the local singular structure of an algebraic function. The expansions
are valid around a singularity and, in particular, they hold in indented disks of the type
required in order to apply the formal translation mechanisms of singularity analysis
(Chapter VI).
Theorem VII.8 (Algebraic asymptotics). Let $f(z) = \sum_n f_n z^n$ be an algebraic series.
Assume that the branch defined by the series at the origin has a unique dominant
singularity at $z = \alpha_1$ on its circle of convergence. Then, the coefficient $f_n$ satisfies
the asymptotic expansion,

(73)    $f_n \sim \alpha_1^{-n}\left(\sum_{k \ge k_0} d_k\, n^{-1-k/\kappa}\right),$

where $k_0 \in \mathbb{Z}$ and $\kappa$ is an integer $\ge 2$.
If $f(z)$ has several dominant singularities $|\alpha_1| = |\alpha_2| = \cdots = |\alpha_r|$, then there
exists an asymptotic decomposition (where $\epsilon$ is some small fixed number, $\epsilon > 0$)

(74)    $f_n = \sum_{j=1}^{r} \phi^{(j)}(n) + O\left((|\alpha_1| + \epsilon)^{-n}\right),$

where each $\phi^{(j)}(n)$ admits a complete asymptotic expansion,

        $\phi^{(j)}(n) \sim \alpha_j^{-n} \sum_{k \ge k_0^{(j)}} d_k^{(j)}\, n^{-1-k/\kappa_j},$

with $k_0^{(j)}$ in $\mathbb{Z}$, and $\kappa_j$ an integer $\ge 2$.
PROOF. An early version of this theorem appears as [174, Th. D, p. 293]. The
expansions granted by Theorem VII.7 are of the exact type required by singularity
analysis (Theorem VI.4, p. 376). For multiple singularities, Theorem VI.5 (p. 381)
based on composite contours is to be used: in that case each $\phi^{(j)}(n)$ is the contribution
obtained by transfer of the corresponding local singular element.
In the case of multiple singularities, partial cancellations may occur in some of
the dominant terms of (74): consider for instance the case of

        $\frac{1}{\sqrt{1 - \frac{6}{5}z + z^2}} = 1 + 0.60z + 0.04z^2 - 0.36z^3 - 0.408z^4 - \cdots,$

where the function has two complex conjugate singularities with an argument not
commensurate to $\pi$, and refer to the corresponding discussion of rational coefficients
asymptotics (Subsection IV. 6.1, p. 250). Fortunately, such delicate arithmetic situations
tend not to arise in combinatorial situations.
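The coefficients of $1/\sqrt{1 - \frac65 z + z^2}$ can be checked exactly. The sketch below relies on a standard identity that is not stated in the text, namely that $(1 - 2xz + z^2)^{-1/2}$ is the generating function of the Legendre polynomials $P_n(x)$; here $x = 3/5$. The function name `legendre_values` is introduced for illustration.

```python
from fractions import Fraction

def legendre_values(x, count):
    """P_0(x), ..., P_{count-1}(x) via Bonnet's recurrence
    (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    values = [Fraction(1), Fraction(x)]
    for n in range(1, count - 1):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[:count]

coeffs = legendre_values(Fraction(3, 5), 40)
print([float(c) for c in coeffs[:5]])

# The two conjugate singularities make the signs fluctuate without a fixed period.
sign_changes = sum(1 for a, b in zip(coeffs, coeffs[1:]) if a * b < 0)
print(sign_changes)
```

The irregular sign pattern is the visible trace of the two conjugate singular contributions in (74) partly cancelling one another.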


EXAMPLE VII.16. Branches of unary-binary trees. The generating function of unary-binary
trees is defined by $P(z, f(z)) = 0$ where

        $P(z, y) = y - z - zy - zy^2,$


FIGURE VII.17. The real algebraic curve corresponding to non-crossing forests.


so that

        $f(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z} = \frac{1 - z - \sqrt{(1 + z)(1 - 3z)}}{2z}.$

There exist only two branches: $f$ and its conjugate $\bar f$ that form a 2-cycle at $z = \frac13$. The
singularities of all branches are at $0, -1, \frac13$ as is apparent from the explicit form of $f$ or from
the defining equation. The branch representing $f(z)$ at the origin is analytic there (by a general
argument or by the combinatorial origin of the problem). Thus, the dominant singularity of $f(z)$
is at $\frac13$ and it is unique in its modulus class. The “easy” case of Theorem VII.8 then applies
once $f(z)$ has been expanded near $\frac13$. As a rule, the organization of computations is simpler if
one makes use of the local uniformizing parameter with a choice of sign in accordance with the
direction along which the singularity is approached. In this case, we set $z = \frac13 - \delta^2$ and find

        $f(z) = 1 - 3\delta + \frac{9}{2}\delta^2 - \frac{63}{8}\delta^3 + \frac{27}{2}\delta^4 - \frac{2997}{128}\delta^5 + \cdots, \qquad \delta = \left(\frac13 - z\right)^{1/2}.$


This translates immediately into

        $f_n \equiv [z^n]f(z) \sim \frac{3^{n+1/2}}{2\sqrt{\pi n^3}}\left(1 - \frac{15}{16n} + \frac{505}{512n^2} - \frac{8085}{8192n^3} + \cdots\right).$

The approximation provided by the first three terms is quite good: for $n = 10$ already, it
estimates $f_{10} = 835$ with an error less than 1. .............. END OF EXAMPLE VII.16.
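The quality of the three-term approximation at $n = 10$ can be reproduced directly. The sketch below is illustrative (the function names are introduced here): exact coefficients come from the recurrence implied by $f = z(1 + f + f^2)$, and the estimate uses the first three terms of the expansion above.

```python
import math

def unary_binary_counts(nmax):
    """Coefficients f_0..f_nmax of the series defined by f = z(1 + f + f^2)."""
    f = [0] * (nmax + 1)
    for n in range(1, nmax + 1):
        # [z^n] z f^2 is the convolution of f with itself at index n - 1.
        conv = sum(f[i] * f[n - 1 - i] for i in range(n))
        f[n] = (1 if n == 1 else 0) + f[n - 1] + conv
    return f

def three_term_estimate(n):
    """First three terms of the asymptotic expansion of f_n."""
    lead = 3 ** (n + 0.5) / (2 * math.sqrt(math.pi * n ** 3))
    return lead * (1 - 15 / (16 * n) + 505 / (512 * n ** 2))

f = unary_binary_counts(10)
print(f[10], three_term_estimate(10))
```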


EXAMPLE VII.17. Branches of non-crossing forests. Consider the polynomial equation
$P(z, y) = 0$, where

        $P(z, y) = y^3 + (z^2 - z - 3)y^2 + (z + 3)y - 1,$

and the combinatorial GF satisfying $P(z, F) = 0$ determined by the initial conditions,

        $F(z) = 1 + z + 2z^2 + 7z^3 + 33z^4 + 181z^5 + 1083z^6 + \cdots.$

(EIS A054727). $F(z)$ is the OGF of non-crossing forests defined in Example 15, p. 462.
The exceptional set is mechanically computed: its elements are roots of the discriminant

        $R = -z^3(5z^3 - 8z^2 - 32z + 4).$

Newton’s algorithm shows that two of the branches at 0, say $y_0$ and $y_2$, form a cycle of length 2
with $y_0 = 1 - \sqrt{z} + O(z)$, $y_2 = 1 + \sqrt{z} + O(z)$ while it is the “middle branch”
$y_1 = 1 + z + O(z^2)$ that corresponds to the combinatorial GF $F(z)$.

The nonzero exceptional points are the roots of the cubic factor of $R$, namely

        $\Omega = \{-1.93028,\ 0.12158,\ 3.40869\}.$

Let $\xi \doteq 0.12158$ be the root in $(0, 1)$. By Pringsheim’s theorem and the fact that the OGF
of an infinite combinatorial class must have a positive dominant singularity in $[0, 1]$, the only
possibility for the dominant singularity of $y_1(z)$ is $\xi$. (For a more general argument, see below.)

For $z$ near $\xi$, the three branches of the cubic give rise to one branch that is analytic with
value approximately 0.67816 and a cycle of two conjugate branches with value near 1.21429
at $z = \xi$. The expansion of the two conjugate branches is of the singular type,

        $\alpha \pm \beta\sqrt{1 - z/\xi},$

where

        $\alpha = \frac{43}{37} + \frac{18}{37}\xi - \frac{35}{74}\xi^2 \doteq 1.21429, \qquad \beta = \frac{1}{37}\sqrt{228 - 981\xi - 5290\xi^2} \doteq 0.14931.$


The determination with a minus sign must be adopted for representing the combinatorial GF
when $z \to \xi^-$ since otherwise one would get negative asymptotic estimates for the nonnegative
coefficients. Alternatively, one may examine the way the three real branches along $(0, \xi)$ match
with one another at 0 and at $\xi^-$, then conclude accordingly.

Collecting partial results, we finally get by singularity analysis the estimate

        $F_n = \frac{\beta}{2\sqrt{\pi n^3}}\,\omega^n\left(1 + O\left(\frac{1}{n}\right)\right), \qquad \omega = \frac{1}{\xi} \doteq 8.22469,$

with the cubic algebraic number $\xi$ and the sextic $\beta$ as above. . END OF EXAMPLE VII.17.
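The branch of $P(z, y) = 0$ that is analytic at the origin with $y_1 = 1 + z + O(z^2)$ can be expanded mechanically. The sketch below is one possible way to do it, not the method of the text: setting $y = 1 + Y$ turns the equation into $Y^3 + (z^2 - z)Y^2 + (2z^2 - z)Y + z^2 = 0$, which is rewritten as the fixed-point relation $Y = z + 2zY + (z - 1)Y^2 + Y^3/z$; each pass of the iteration settles one more coefficient. The helper names `mul` and `forest_branch` are hypothetical.

```python
def mul(p, q, n):
    """Product of two coefficient lists, truncated after z^n."""
    r = [0] * (n + 1)
    for i, a in enumerate(p[: n + 1]):
        if a:
            for j, b in enumerate(q[: n + 1 - i]):
                r[i + j] += a * b
    return r

def forest_branch(n):
    """Coefficients of z^0..z^n of the branch y_1 = 1 + Y of P(z, y) = 0."""
    Y = [0, 1] + [0] * n              # start from Y = z, kept to order n + 1
    for _ in range(n):
        Y2 = mul(Y, Y, n + 1)
        Y3 = mul(Y2, Y, n + 1)
        new = [0] * (n + 2)
        new[1] = 1                                   # the term z
        for k in range(n + 2):
            if k >= 1:
                new[k] += 2 * Y[k - 1] + Y2[k - 1]   # 2zY + zY^2
            new[k] -= Y2[k]                          # -Y^2
            if k + 1 <= n + 1:
                new[k] += Y3[k + 1]                  # Y^3 / z
        Y = new
    return [1] + Y[1 : n + 1]

coeffs = forest_branch(80)
omega = 8.22469                        # 1/xi, value quoted in the text
ratio = coeffs[80] / coeffs[79]
print(coeffs[:7])
print(ratio, omega * (79 / 80) ** 1.5)
```

The ratio of consecutive coefficients is compared with $\omega\,(1 - 1/n)^{3/2}$, which is what an $\omega^n n^{-3/2}$ growth predicts to first order.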


The example above illustrates several important points in the analysis of coeffi- 
cients of algebraic functions when there are no simple explicit radical forms. First of 
all a given combinatorial problem determines a unique branch of an algebraic curve 
at the origin. Next, the dominant singularity has to be identified by “connecting” the 
combinatorial branch with the branches at every possible singularity of the curve. Fi- 
nally, computations tend to take place over algebraic numbers and not simply rational 
numbers. 

So far, examples have illustrated the common situation where the exponent at the
dominant singularity is $\frac12$, which is reflected by a factor of $n^{-3/2}$ in the asymptotic
form of coefficients. Our last example shows a case where the exponent assumes a
different value, namely $\frac14$.


EXAMPLE VII.18. Branches of “supertrees”. Consider the quartic equation

        $y^4 - 2y^3 + (1 + 2z)y^2 - 2yz + 4z^3 = 0$

and let $K$ be the branch analytic at 0 determined by the initial conditions:

        $K(z) = 2z^2 + 2z^3 + 8z^4 + 18z^5 + 64z^6 + 188z^7 + \cdots.$

The OGF $K$ corresponds to bicoloured “supertrees” of Example VI.10, p. 394.
FIGURE VII.18. The real algebraic curve associated to the generating function of
supertrees of type $K$.

The discriminant is found to be

        $R = 16z^4(16z^2 + 4z - 1)(-1 + 4z)^3,$

with roots at $\frac14$ and $(-1 \pm \sqrt5)/8$. The dominant singularity of the branch of combinatorial
interest turns out to be at $z = \frac14$ where $K(\frac14) = \frac12$. The translation $z = \frac14 + Z$, $y = \frac12 + Y$
then transforms the basic equation into

        $4Y^4 + 8ZY^2 + 16Z^3 + 12Z^2 + Z = 0.$


According to Newton’s polygon, the main cancellation arises from $4Y^4 + Z = 0$: this
corresponds to a segment of inverse slope 1/4 in the Newton diagram and accordingly to a cycle
formed with 4 conjugate branches, i.e., a fourth-root singularity. Thus, one has,

        $K(z) \underset{z \to 1/4}{\sim} \frac12 - \frac{1}{\sqrt2}\left(\frac14 - z\right)^{1/4} - \frac{1}{\sqrt2}\left(\frac14 - z\right)^{3/4} + \cdots, \qquad [z^n]K(z) \underset{n \to \infty}{\sim} \frac{4^n}{8\,\Gamma(\frac34)\,n^{5/4}},$

which is consistent with earlier found values (p. 394). ........ END OF EXAMPLE VII.18.
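The fourth-root behaviour can be observed on exact coefficients. The sketch below is an illustration under stated assumptions: it extracts the branch $K(z) = 2z^2 + \cdots$ from the quartic by rewriting it as $2zy = y^4 - 2y^3 + (1 + 2z)y^2 + 4z^3$ and iterating (each pass settles at least one more coefficient), then compares $[z^n]K(z)$ with $4^n/(8\,\Gamma(\frac34)\,n^{5/4})$. The helper names `mul` and `supertree_branch` are hypothetical.

```python
import math

def mul(p, q, n):
    """Product of two coefficient lists, truncated after z^n."""
    r = [0] * (n + 1)
    for i, a in enumerate(p[: n + 1]):
        if a:
            for j, b in enumerate(q[: n + 1 - i]):
                r[i + j] += a * b
    return r

def supertree_branch(n):
    """Coefficients K_0..K_n of the branch K(z) = 2z^2 + ... of the quartic."""
    y = [0] * (n + 2)
    for _ in range(n):
        y2 = mul(y, y, n + 1)
        y3 = mul(y2, y, n + 1)
        y4 = mul(y2, y2, n + 1)
        new = [0] * (n + 2)
        for k in range(n + 1):
            # [z^(k+1)] of y^4 - 2y^3 + (1 + 2z) y^2 + 4z^3 equals 2 * y[k].
            num = y4[k + 1] - 2 * y3[k + 1] + y2[k + 1] + 2 * y2[k]
            if k == 2:
                num += 4
            new[k] = num // 2   # exact once the lower coefficients have settled
        y = new
    return y[: n + 1]

K = supertree_branch(100)
estimate = 4 ** 100 / (8 * math.gamma(0.75) * 100 ** 1.25)
print(K[:8])
print(K[100] / estimate)
```

Convergence of the ratio to 1 is slow here, since the next singular term $(\frac14 - z)^{3/4}$ contributes a relative correction of order $n^{-1/2}$.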


Computable coefficient asymptotics. The previous discussion contains the germ 
of a complete algorithm for deriving an asymptotic expansion of coefficients of any 
algebraic function. We sketch here the main principles leaving some of the details to 
the reader. Observe that the problem is a connection problem: the “shapes” of the 
various sheets around each point (including the exceptional points) are known, but 
it remains to connect them together and see which ones are encountered first when 
starting from a given branch at the origin. 


Algorithm ACA: Algebraic Coefficient Asymptotics.
Input: A polynomial $P(z, y)$ with $d = \deg_y P(z, y)$; a series $Y(z)$ such that
$P(z, Y) = 0$ and assumed to be specified by sufficiently many initial terms so as to
be distinguished from all other branches.
Output: The asymptotic expansion of $[z^n]Y(z)$ whose existence is granted by Theorem VII.8.
The algorithm consists of three main steps: Preparation, Dominant singularities,
and Translation.

I. Preparation: Define the discriminant $R(z) = \mathbf{R}(P, P'_y, y)$.

(P1) Compute the exceptional set $\Xi = \{z \mid R(z) = 0\}$ and the points of infinity
$\Xi_0 = \{z \mid p_0(z) = 0\}$, where $p_0(z)$ is the leading coefficient of $P(z, y)$ considered as a
function of $y$.

(P2) Determine the Puiseux expansions of all the $d$ branches at each of the points of
$\Xi \cup \{0\}$ (by Newton diagrams and/or indeterminate coefficients). This includes the
expansion of analytic branches as well. Let $\{y_{\alpha,j}(z)\}_{j=1}^{d}$ be the collection of all
such expansions at some $\alpha \in \Xi \cup \{0\}$.

(P3) Identify the branch at 0 that corresponds to $Y(z)$.


II. Dominant singularities (Controlled approximate matching of branches). Let $\Xi_1, \Xi_2, \ldots$
be a partition of the elements of $\Xi \cup \{0\}$ sorted according to the increasing values of their
modulus: it is assumed that the numbering is such that if $\alpha \in \Xi_i$ and $\beta \in \Xi_j$, then $|\alpha| < |\beta|$ is
equivalent to $i < j$. Geometrically, the elements of $\Xi$ have been grouped in concentric circles.
First, a preparation step is needed.


(D1) Determine a nonzero lower bound $\delta$ on the radius of convergence of any local Puiseux
expansion of any branch at any point of $\Xi$. Such a bound can be constructed from
the minimal distance between elements of $\Xi$ and from the degree $d$ of the equation.


The sets $\Xi_j$ are to be examined in sequence until it is detected that one of them contains a
singularity. At step $j$, let $\sigma_1, \sigma_2, \ldots, \sigma_s$ be an arbitrary listing of the elements of $\Xi_j$. The problem
is to determine whether any $\sigma_k$ is a singularity and, in that event, to find the right branch to
which it is associated. This part of the algorithm proceeds by controlled numerical approximations
of branches and constructive bounds on the minimum separation distance between distinct
branches.


(D2) For each candidate singularity $\sigma_k$, with $k \ge 2$, set $\zeta_k = \sigma_k(1 - \delta/2)$. By assumption,
each $\zeta_k$ is in the domain of convergence of $Y(z)$ and of any $y_{\sigma_k,j}$.

(D3) Compute a nonzero lower bound $\eta_k$ on the minimum distance between two roots of
$P(\zeta_k, y) = 0$. This separation bound can be obtained from resultant computations.

(D4) Estimate $Y(\zeta_k)$ and each $y_{\sigma_k,j}(\zeta_k)$ to an accuracy better than $\eta_k/4$. If two elements,
$Y(z)$ and $y_{\sigma_k,j}(z)$, are (numerically) found to be at a distance less than $\eta_k$ for $z = \zeta_k$,
then they are matched: $\sigma_k$ is a singularity and the corresponding $y_{\sigma_k,j}$ is the
corresponding singular element. Otherwise, $\sigma_k$ is declared to be a regular point for
$Y(z)$ and discarded as candidate singularity.


The main loop on $j$ is repeated until a singularity has been detected, when $j = j_0$, say. The
radius of convergence $\rho$ is then equal to the common modulus of elements of $\Xi_{j_0}$; the
corresponding singular elements are retained.


III. Coefficient expansion. Collect the singular elements at all the points $\sigma$ determined to
be a dominant singularity at Phase II. Translate termwise using the singularity analysis rule,

        $(\sigma - z)^{p/\kappa} \ \mapsto\ \sigma^{p/\kappa - n}\,\frac{\Gamma(-p/\kappa + n)}{\Gamma(-p/\kappa)\,\Gamma(n + 1)},$

and reorganize into descending powers of $n$, if needed.
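The termwise translation rule is an exact coefficient formula, so it can be checked on a small case. The sketch below is illustrative (the names `transfer` and `leading` are introduced here): it evaluates $[z^n](\sigma - z)^{e}$ with $e = p/\kappa$ through Gamma functions and compares it with the first-order form $\sigma^{e-n} n^{-e-1}/\Gamma(-e)$ used when reorganizing into descending powers of $n$. It uses `math.gamma`, which limits it to moderate values of $n$.

```python
import math

def transfer(sigma, e, n):
    """Exact [z^n] (sigma - z)**e, for e not a nonnegative integer."""
    return sigma ** (e - n) * math.gamma(n - e) / (math.gamma(-e) * math.gamma(n + 1))

def leading(sigma, e, n):
    """Main asymptotic term of the same coefficient."""
    return sigma ** (e - n) * n ** (-e - 1) / math.gamma(-e)

# (1/4 - z)^(1/2) = 1/2 - z - z^2 - 2 z^3 - 5 z^4 - ...
exact = [transfer(0.25, 0.5, n) for n in range(5)]
print(exact)
print(transfer(0.25, 0.5, 100) / leading(0.25, 0.5, 100))
```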


This algorithm vindicates the following assertion: 


Proposition VII.5 (Decidability of algebraic connections). The dominant singularities
of a branch of an algebraic function can be determined by the algorithm ACA in
a finite number of operations.

VII. 8. Combinatorial applications of algebraic functions


The next two subsections introduce objects whose construction leads to algebraic 
functions, in a way that extends the basic symbolic method. This includes: walks 
with a finite number of allowed jumps (Subsection VII. 8.1) and planar maps (Subsec- 
tion VII. 8.2). In such cases, bivariate functional equations reflect the combinatorial 
decompositions of objects. The common form of these functional equations is 


(75)    $\Phi(z, u, F(z, u), h_1(z), \ldots, h_r(z)) = 0,$


where $\Phi$ is a known polynomial and the unknown functions are $F$ and $h_1, \ldots, h_r$.
Specific methods are needed in order to attain solutions to such functional equations 
that would seem at first glance to be grossly underdetermined. Random walks lead to 
a linear version of (75) that is treated by the so-called “kernel method”. Maps lead 
to nonlinear versions that are solved by means of Tutte’s “quadratic method”. In both 
cases, the strategy consists in binding z and u by forcing them to lie on an algebraic 
curve (suitably chosen in order to eliminate the dependency on F(z, u)), and then 
pulling out algebraic consequences of such a specialization. 


VII. 8.1. Walks and the kernel method. Start with a set $\Omega$ that is a finite subset
of $\mathbb{Z}$ and is called the set of jumps. A walk (relative to $\Omega$) is a sequence
$w = (w_0, w_1, \ldots, w_n)$ such that $w_0 = 0$ and $w_{i+1} - w_i \in \Omega$, for all $i$, $0 \le i < n$. A
nonnegative walk (also known as a “meander”) satisfies $w_i \ge 0$ and an excursion is
a nonnegative walk such that, additionally, $w_n = 0$. A bridge is a walk such that
$w_n = 0$. The quantity $n$ is called the length of the walk or the excursion. For instance,
Dyck paths and Motzkin paths analysed in Section V. 3, p. 295, are excursions
that correspond to $\Omega = \{-1, +1\}$ and $\Omega = \{-1, 0, +1\}$ respectively. (Walks and
excursions are also somewhat related to paths in graphs in the sense of Section V. 6,
p. 340.)
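The four families just defined can be counted by a direct dynamic programme over altitudes, which is convenient for checking the generating functions derived later. The sketch below is a minimal illustration (the function name `count_walks` is hypothetical); jumps are given as a list, so that a repeated entry acts as a weight.

```python
from collections import Counter

def count_walks(jumps, n):
    """Numbers of walks, bridges, meanders and excursions of length n."""
    free = Counter({0: 1})       # altitude -> number of unconstrained walks
    nonneg = Counter({0: 1})     # altitude -> number of walks that stayed >= 0
    for _ in range(n):
        nxt_free, nxt_nonneg = Counter(), Counter()
        for h, c in free.items():
            for j in jumps:
                nxt_free[h + j] += c
        for h, c in nonneg.items():
            for j in jumps:
                if h + j >= 0:   # discard steps that would go below the axis
                    nxt_nonneg[h + j] += c
        free, nonneg = nxt_free, nxt_nonneg
    return {
        "walks": sum(free.values()),
        "bridges": free[0],
        "meanders": sum(nonneg.values()),
        "excursions": nonneg[0],
    }

print(count_walks([-1, +1], 6))        # Dyck steps
print(count_walks([-1, 0, +1], 5))     # Motzkin steps
```

With Dyck steps and length 6, the excursion count is the Catalan number 5; with Motzkin steps and length 5, it is the Motzkin number 21.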

We let $-c$ denote the smallest (negative) value of a jump, and $d$ denote the largest
(positive) jump. A fundamental rôle is played in this discussion by the characteristic
polynomial of the walk,

        $S(y) := \sum_{\omega \in \Omega} y^{\omega} = \sum_{j=-c}^{d} S_j y^j,$

which is a Laurent polynomial^12, that is, it involves negative powers of the variable $y$.


Walks. Observe first the rational character of the BGF of walks, with $z$ marking
length and $u$ marking final altitude:

(76)    $W(z, u) = \frac{1}{1 - zS(u)}.$

Since walks may terminate at a negative altitude, this is a Laurent series in $u$.


12 If $\Omega$ is a set, then the coefficients of $S$ lie in $\{0, 1\}$. The treatment above applies in all generality to
cases where the coefficients are arbitrary positive real numbers. This accounts for probabilistic situations
as well as multisets of jump values.

Bridges. The GF of bridges is formally $[u^0]W(z, u)$, since they correspond to
walks that end at altitude 0. Thus one has

        $B(z) = \frac{1}{2i\pi}\int_{\gamma} \frac{1}{1 - zS(u)}\,\frac{du}{u},$

upon integrating along a circle $\gamma$ that separates the small and large branches. The
integral can then be evaluated by residues (details are in [21]).


Excursions and meanders. We propose next to determine the number $F_n$ of excursions
of length $n$ and type $\Omega$, via the corresponding OGF

        $F(z) = \sum_{n=0}^{\infty} F_n z^n.$

In fact, we shall determine the more general BGF

        $F(z, u) := \sum_{n,k} F_{n,k}\, u^k z^n,$

where $F_{n,k}$ is the number of nonnegative walks (meanders) of length $n$ and final altitude
$k$ (i.e., the value of $w_n$ in the definition of a walk is constrained to equal $k$). In
particular, one has $F(z) = F(z, 0)$.

The main result to be proved below is the following: For each finite set $\Omega \subset \mathbb{Z}$,
the generating function of excursions is an algebraic function that is explicitly computable
from $\Omega$. There are many ways to view this result. The problem is usually
treated within probability theory by means of Wiener-Hopf factorizations [413], and
Lalley [324] offers an insightful analytic treatment under this angle. On another register,
Labelle and Yeh [320] show that an unambiguous context-free specification of
excursions can be systematically constructed, a fact that is sufficient to ensure the
algebraicity of the GF $F(z)$. (Their approach is implicitly based on the construction of
a finite pushdown automaton itself equivalent, by general principles, to a context-free
grammar.) The Labelle-Yeh construction reduces the problem to a large, but somewhat
“blind”, combinatorial preprocessing. Accordingly, for analysts, it has the disadvantage
of not extracting a simpler analytic (and noncombinatorial) structure inherent in
the problem: the shape of the end result is predicted by the Drmota-Lalley-Woods
Theorem, but the nature of the constants is not accessible in this way.

The method described below is often known as the kernel method. It takes some
of its inspiration from exercises in the 1968 edition of Knuth’s book [300] (Ex. 2.2.1.4
and 2.2.1.11), where a new approach was proposed to the enumeration of Catalan and
Schröder objects. The technique has since been extended and systematized by several
authors; see for instance [20, 21, 70, 158, 159] for relevant combinatorial works. Our
presentation below follows that of Lalley [324] and Banderier-Flajolet [21].


The polynomial $f_n(u) = [z^n]F(z, u)$ is the generating function of nonnegative
walks of length $n$, with $u$ recording final altitude. A simple recurrence relates $f_{n+1}(u)$
to $f_n(u)$, namely,

(77)    $f_{n+1}(u) = S(u) \cdot f_n(u) - r_n(u),$

where $r_n(u)$ is a Laurent polynomial consisting of the sum of all the monomials of
$S(u) f_n(u)$ that involve negative powers^13 of $u$:

(78)    $r_n(u) := \sum_{j=-c}^{-1} u^j \left([u^j]\, S(u) f_n(u)\right) = \{u^{<0}\}\, S(u) f_n(u).$

The idea behind the formula is to subtract the effect of those steps that would take the
walk below the horizontal axis. For instance, one has

        $S(u) = \frac{S_{-1}}{u} + O(1): \qquad r_n(u) = \frac{S_{-1}}{u} f_n(0)$

        $S(u) = \frac{S_{-2}}{u^2} + \frac{S_{-1}}{u} + O(1): \qquad r_n(u) = \left(\frac{S_{-2}}{u^2} + \frac{S_{-1}}{u}\right) f_n(0) + \frac{S_{-2}}{u} f_n'(0),$

and generally, $r_n(u) = \sum_{j=0}^{c-1} \lambda_j(u) \left[\frac{\partial^j}{\partial u^j} f_n(u)\right]_{u=0}$, where

(79)    $\lambda_j(u) = \frac{1}{j!}\{u^{<0}\}\left(u^j S(u)\right).$


Thus, from (77) and (78) (multiply by $z^{n+1}$ and sum), the generating function
$F(z, u)$ satisfies the fundamental functional equation

(80)    $F(z, u) = 1 + zS(u)F(z, u) - z\{u^{<0}\}\left(S(u)F(z, u)\right).$

Explicitly, one has

(81)    $F(z, u) = 1 + zS(u)F(z, u) - z\sum_{j=0}^{c-1} \lambda_j(u) \left[\frac{\partial^j}{\partial u^j}F(z, u)\right]_{u=0},$

for Laurent polynomials $\lambda_j(u)$ that depend on $S(u)$ in an effective way by (79).

The main equations (80) and (81) involve one unknown bivariate GF, $F(z, u)$,
and $c$ univariate GFs, the partial derivatives of $F$ specialized at $u = 0$. It is true, but
not at all obvious, that the single functional equation (81) fully determines the $c + 1$
unknowns. The basic technique is known as “cancelling the kernel” and it relies on
strong analyticity properties; see the book by Fayolle et al. [159] for deep ramifications
in the study of 2-dimensional walks. The form of (81) to be employed for this
purpose starts by grouping on one side the terms involving $F(z, u)$,

(82)    $F(z, u)(1 - zS(u)) = 1 - z\sum_{j=0}^{c-1} \lambda_j(u)\, G_j(z), \qquad G_j(z) := \left[\frac{\partial^j}{\partial u^j}F(z, u)\right]_{u=0}.$

If the sum on the right side were not present, then the solution would reduce to (76). In the
case at hand, from the combinatorial origin of the problem and implied bounds, the quantity
$F(z, u)$ is bivariate analytic at $(z, u) = (0, 0)$ (by elementary exponential majorizations
on the coefficients). The main principle of the kernel method consists in coupling
the values of $z$ and $u$ in such a way that $1 - zS(u) = 0$, so that $F(z, u)$ disappears
from the picture. A condition is that both $z$ and $u$ should remain small (so that $F$ remains
analytic). Relations between the partial derivatives are then obtained from such
a specialization, $(z, u) \mapsto (z, u(z))$, which happen to be just in the right number.


13 The convenient notation $\{u^{<0}\}$ denotes the singular part of a Laurent expansion:
$\{u^{<0}\} f(u) := \sum_{j<0} \left([u^j] f(u)\right) u^j$.

Consequently, we consider the “kernel equation”,

(83)    $1 - zS(u) = 0,$

which is rewritten as

        $u^c = z \cdot (u^c S(u)).$

Under this form, it is clear that the kernel equation (83) defines $c + d$ branches of an
algebraic function. A local analysis shows that, amongst these $c + d$ branches, there
are $c$ branches that tend to 0 as $z \to 0$ while the other $d$ tend to infinity as $z \to 0$. (The
idea is that, in the equation (83), either one of $zu^{-c} \approx 1$ or $zu^{d} \approx 1$ predominates;
equivalently, a Newton polygon can be constructed.) Let $u_0(z), \ldots, u_{c-1}(z)$ be the
$c$ branches that tend to 0, that we call “small” branches. In addition, we single out
$u_0(z)$, the “principal” solution, by the reality condition

        $u_0(z) \sim \gamma z^{1/c}, \qquad \gamma := (S_{-c})^{1/c} \in \mathbb{R}_{>0} \qquad (z \to 0^+).$

By local uniformization (70), the conjugate branches are given locally by

        $u_\ell(z) = u_0(e^{2i\pi\ell} z) \qquad (z \to 0^+).$

Coupling $z$ and $u$ by $u = u_\ell(z)$ produces interesting specializations of Equation (82).
In that case, $(z, u)$ is close to $(0, 0)$ where $F$ is bivariate analytic so that the
substitution is admissible. By substitution, we get

(84)    $1 - z\sum_{j=0}^{c-1} \lambda_j(u_\ell(z)) \left[\frac{\partial^j}{\partial u^j}F(z, u)\right]_{u=0} = 0, \qquad \ell = 0\,..\,c-1.$

This is now a linear system of $c$ equations in $c$ unknowns (the partial derivatives) with
algebraic coefficients that, in principle, determines $F(z, 0)$.
A convenient approach to the solution of (84) is due to Mireille Bousquet-Mélou.
The argument goes as follows. The quantity

(85)    $M(u) := u^c - zu^c \sum_{j=0}^{c-1} \lambda_j(u)\, \frac{\partial^j F}{\partial u^j}(z, 0)$

can be regarded as a polynomial in $u$. It is monic while it vanishes by construction at
the $c$ small branches $u_0, \ldots, u_{c-1}$. Consequently, one has the factorization,

(86)    $M(u) = \prod_{\ell=0}^{c-1} (u - u_\ell(z)).$

Now, the constant term of $M(u)$ is otherwise known to equal $-zS_{-c}F(z, 0)$, by the
definition (85) of $M(u)$ and by Equation (79) specialized to $\lambda_0(u)$. Thus, the comparison
of constant terms between (85) and (86) provides us with an explicit form of the
OGF of excursions:

        $F(z, 0) = \frac{(-1)^{c-1}}{zS_{-c}} \prod_{\ell=0}^{c-1} u_\ell(z).$

One can then finally return to the original functional equation and pull the BGF $F(z, u)$.

In summary:

Proposition VII.6. Let $\Omega$ be a finite set of jumps and let $S(u)$ be the characteristic
polynomial of $\Omega$. Consider the $c$ small branches of the “kernel” equation,

        $1 - zS(u) = 0,$

denoted by $u_0(z), \ldots, u_{c-1}(z)$. The generating function of excursions is given by

        $F(z) = \frac{(-1)^{c-1}}{zS_{-c}} \prod_{\ell=0}^{c-1} u_\ell(z), \qquad \text{where } S_{-c} = [u^{-c}]S(u)$

is the multiplicity (or weight) of the smallest element $-c \in \Omega$. More generally the
bivariate generating function of nonnegative walks (also known as meanders) with $u$
marking final altitude is bivariate algebraic and given by

        $F(z, u) = \frac{1}{u^c - zu^cS(u)} \prod_{\ell=0}^{c-1} (u - u_\ell(z)).$

The OGF of bridges is expressible in terms of the small branches, by

        $B(z) = z\sum_{j=0}^{c-1} \frac{u_j'(z)}{u_j(z)} = z\,\frac{d}{dz}\log\left(u_0(z) \cdots u_{c-1}(z)\right).$


EXAMPLE VII.19. Trees and Łukasiewicz codes. A particular class of walks is of special
interest; it corresponds to cases where $c = 1$, that is, the largest jump in the negative direction
has amplitude 1. Consequently, $\Omega + 1 = \{0, s_1, s_2, \ldots, s_d\}$. In that situation, combinatorial
theory teaches us the existence of fundamental isomorphisms between walks defined by steps
$\Omega$ and trees whose degrees are constrained to lie in $1 + \Omega$. The correspondence is by way of
Łukasiewicz codes^14, also known as “Polish” prefix codes introduced in Chapter I. From this
correspondence, we expect to find tree GFs in such cases.

As regards generating functions, there now exists only one small branch, namely the solution
$u_0(z)$ to $u_0(z) = z\phi(u_0(z))$ (where $\phi(u) = uS(u)$) that is analytic at the origin. One
then has $F(z) = F(z, 0) = \frac{1}{z}u_0(z)$, so that the walk GF is determined by

        $F(z, 0) = \frac{1}{z}u_0(z), \qquad u_0(z) = z\phi(u_0(z)), \qquad \phi(u) := uS(u).$

This form is consistent with what is already known regarding the enumeration of simple families
of trees. In addition, one finds

        $F(z, u) = \frac{1 - u^{-1}u_0(z)}{1 - zS(u)} = \frac{u - u_0(z)}{u - z\phi(u)}.$

Classical specializations are rederived in this way:
— the Catalan walk (Dyck path), defined by $\Omega = \{-1, +1\}$ and $\phi(u) = 1 + u^2$, has

        $u_0(z) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z^2}\right);$

— the Motzkin walk, defined by $\Omega = \{-1, 0, +1\}$ and $\phi(u) = 1 + u + u^2$, has

        $u_0(z) = \frac{1}{2z}\left(1 - z - \sqrt{1 - 2z - 3z^2}\right);$

— the modified Catalan walk, defined by $\Omega = \{-1, 0, 0, +1\}$ (with two steps of type 0)
and $\phi(u) = 1 + 2u + u^2$, has

        $u_0(z) = \frac{1}{2z}\left(1 - 2z - \sqrt{1 - 4z}\right);$

— the $d$-ary tree walk (the excursions encode $d$-ary trees) defined by $\Omega = \{-1, d - 1\}$,
has $u_0(z)$ that is defined implicitly by $u_0(z) = z(1 + u_0(z)^d)$.

The kernel method thus provides a new perspective for the enumeration of Dyck paths and
related objects. .............. END OF EXAMPLE VII.19.


14 Such a code (p. 70) is obtained by a preorder traversal of the tree, recording a jump of $r - 1$ when a
node of outdegree $r$ is encountered. The sequence of jumps gives rise to an excursion followed by an extra
$-1$ jump.


EXAMPLE VII.20. Walks with amplitude at most 2. Take $\Omega = \{-2, -1, 1, 2\}$, so that

        $S(u) = u^{-2} + u^{-1} + u + u^2.$

Then, $u_0(z)$, $u_1(z)$ are the two branches that vanish as $z \to 0$ of the curve

        $y^2 = z(1 + y + y^3 + y^4).$

The linear system that determines $F(z, 0)$ and $F'_u(z, 0)$ is

        $1 - \left(\frac{z}{u_0(z)^2} + \frac{z}{u_0(z)}\right)F(z, 0) - \frac{z}{u_0(z)}F'_u(z, 0) = 0$
        $1 - \left(\frac{z}{u_1(z)^2} + \frac{z}{u_1(z)}\right)F(z, 0) - \frac{z}{u_1(z)}F'_u(z, 0) = 0$

(derivatives are taken with respect to the second argument) and one finds

        $F(z, 0) = -\frac{1}{z}u_0(z)u_1(z), \qquad F'_u(z, 0) = \frac{1}{z}\left(u_0(z) + u_1(z) + u_0(z)u_1(z)\right).$

This gives the number of walks, through a combination of series expansions,

        $F(z) = 1 + 2z^2 + 2z^3 + 11z^4 + 24z^5 + 93z^6 + 272z^7 + 971z^8 + 3194z^9 + \cdots.$


A single algebraic equation for $F(z) \equiv F(z,0)$ is then obtained by elimination (e.g., via
Groebner bases) from the system:

$$\begin{cases}
u_0^2 - z(1 + u_0 + u_0^3 + u_0^4) = 0 \\
u_1^2 - z(1 + u_1 + u_1^3 + u_1^4) = 0 \\
zF + u_0u_1 = 0
\end{cases}$$

Elimination shows that $F(z)$ is a root of the equation

$$z^4y^4 - z^2(1 + 2z)y^3 + z(2 + 3z)y^2 - (1 + 2z)y + 1 = 0.$$


For $\Omega = \{-2, -1, 0, 1, 2\}$, we find similarly $F(z) = -\frac{1}{z}u_0(z)u_1(z)$, where $u_0$, $u_1$ are
the small branches of $y^2 = z(1 + y + y^2 + y^3 + y^4)$; the expansion starts as

$$F(z) = 1 + z + 3z^2 + 9z^3 + 32z^4 + 120z^5 + 473z^6 + 1925z^7 + 8034z^8 + \cdots$$

(EIS A104184), and $F(z)$ is a root of the equation

$$z^4y^4 - z^2(1 + z)y^3 + z(2 + z)y^2 - (1 + z)y + 1 = 0.$$

In such cases, the GFs are no longer of the simple tree type. . . END OF EXAMPLE VII.20.
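
The first terms of both expansions can be cross-checked independently of the kernel method by a direct count. The following sketch assumes nothing beyond the definition of an excursion (a walk from 0 to 0 that never goes below 0); the function name `count_excursions` is hypothetical.

```python
def count_excursions(steps, n_max):
    """Numbers of excursions of length 0..n_max for the given step set.

    An excursion starts at height 0, ends at height 0, and never goes below 0.
    """
    counts = []
    level = {0: 1}  # level[h] = number of nonnegative walks of the current length ending at h
    for _ in range(n_max + 1):
        counts.append(level.get(0, 0))
        nxt = {}
        for h, ways in level.items():
            for s in steps:
                if h + s >= 0:
                    nxt[h + s] = nxt.get(h + s, 0) + ways
        level = nxt
    return counts


print(count_excursions((-2, -1, 1, 2), 9))
print(count_excursions((-2, -1, 0, 1, 2), 8))
```

The two printed lists should reproduce the coefficients of the series F(z) given above for the two step sets.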
The singularities of the branches involved in the statement of Proposition VII.6
can be worked out in all generality [21, 324]. The roots of the kernel equation (83) are
singular at points z with value u satisfying the simultaneous set of equations,

$$1 - zS(u) = 0, \qquad S'(u) = 0,$$

where the second equation corresponds to a place where the analytic implicit function
theorem “fails” to define u as an analytic function of z. The second equation always
has a positive root τ, corresponding to a positive value of z, which is ρ = 1/S(τ). It
is then natural to suspect ρ to be the radius of convergence of F(z) and the singularity to
be of the square-root type ($Z^{1/2}$), this for reasons seen in the proof of Theorem VII.3
(the smooth implicit-function schema). These properties are shown in complete detail
in the article [21], where it is also established that the GF of bridges is of singular type
$Z^{-1/2}$, just like for Dyck bridges.


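Proposition VII.7. Define the structural constant τ by S'(τ) = 0, τ > 0. Then,
assuming aperiodicity, the number of bridges ($B_n$) and the number of excursions ($F_n$)
satisfy

$$B_n \sim \beta_0\,\frac{S(\tau)^n}{\sqrt{2\pi n}}, \qquad F_n \sim \epsilon_0\,\frac{S(\tau)^n}{2\sqrt{\pi n^3}},$$

where

$$\beta_0 = \frac{1}{\tau}\sqrt{\frac{S(\tau)}{S''(\tau)}}, \qquad
\epsilon_0 = \frac{(-1)^{c-1}}{p_{-c}}\sqrt{\frac{2S(\tau)^3}{S''(\tau)}}\;\prod_{j=1}^{c-1}u_j(\rho).$$

There, the $u_j$ represent the small branches and $u_0$ is the branch that is real positive
as $z \to 0$.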


Proposition VII.7 expresses a universal law of type $n^{-3/2}$ for excursions and
$n^{-1/2}$ for bridges, a fact otherwise at least partly accessible to classical probability
theory (e.g., via a local limit theorem for bridges and via Brownian motion for
excursions). Basic parameters of walks, excursions, bridges, and meanders can then be
analysed in a uniform fashion [21].
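
As a numerical illustration of these universal laws, the sketch below compares exact counts with the asymptotic estimates $\beta_0 S(\tau)^n/\sqrt{2\pi n}$ and $\epsilon_0 S(\tau)^n/(2\sqrt{\pi n^3})$ in the Motzkin case $S(u) = u^{-1} + 1 + u$, for which τ = 1, S(τ) = 3, S''(τ) = 2, c = 1 and all weights equal 1. The function name `walk_counts` and the choice n = 400 are assumptions made for the illustration only.

```python
import math


def walk_counts(steps, n, nonnegative):
    """Number of n-step walks from 0 back to 0 (excursions if nonnegative, else bridges)."""
    level = {0: 1}
    for _ in range(n):
        nxt = {}
        for h, ways in level.items():
            for s in steps:
                if nonnegative and h + s < 0:
                    continue
                nxt[h + s] = nxt.get(h + s, 0) + ways
        level = nxt
    return level.get(0, 0)


# Motzkin steps: S(u) = 1/u + 1 + u, so S'(tau) = 0 at tau = 1.
tau, S, S2 = 1.0, 3.0, 2.0
beta0 = math.sqrt(S / S2) / tau
eps0 = math.sqrt(2 * S**3 / S2)  # c = 1, unit weights: empty product and prefactor 1

n = 400
bridges = walk_counts((-1, 0, 1), n, nonnegative=False)
excursions = walk_counts((-1, 0, 1), n, nonnegative=True)
bridge_ratio = bridges / (beta0 * S**n / math.sqrt(2 * math.pi * n))
excursion_ratio = excursions / (eps0 * S**n / (2 * math.sqrt(math.pi * n**3)))
print(bridge_ratio, excursion_ratio)  # both ratios should be close to 1
```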


VII. 8.2. Maps and the quadratic method. A (planar) map is a connected pla- 
nar graph together with an embedding into the plane. In all generality, loops and 
multiple edges are allowed. A planar map therefore separates the plane into regions 
called faces (Figure 19). The maps considered here are in addition rooted, meaning 
that a face, an incident edge, and an incident vertex are distinguished. In this section,
only rooted maps are considered¹⁵. When representing rooted maps, we shall agree
to draw the root edge with an arrow pointing away from the root node, and to take 
the root face as that face lying to the left of the directed edge (represented in grey on 
Figure 19). 


¹⁵Nothing is lost regarding asymptotic properties of random structures when a rooting is imposed.
The reason is that a map has, with probability exponentially close to 1, a trivial automorphism group;
consequently, almost all maps of m edges can be rooted in 2m ways (by choosing an edge, and an orientation
of this edge), and there is an almost uniform 2m-to-1 correspondence between unrooted maps and rooted
ones.
FIGURE VII.19. A planar map.


Tutte launched in the 1960’s a large census of planar maps, with the intention
of attacking the four-colour problem by enumerative techniques¹⁶; see [73, 471, 472,
473, 474]. There exists in fact an entire zoo of maps defined by various degree or
connectivity constraints. In this chapter, we shall limit ourselves to conveying a flavour
of this vast theory, with the goal of showing how algebraic functions arise. The
presentation takes its inspiration from the book of Goulden and Jackson [244, Sec. 2.9].

Let $\mathcal{M}$ be the class of all maps where size is taken to be the number of edges. Let
$M(z,u)$ be the BGF of maps with u marking the number of edges on the outside face.
The basic surgery performed on maps distinguishes two cases based upon the nature
of the root edge. A rooted map μ will be declared to be isthmic if the root edge r of map
μ is an “isthmus” whose deletion would disconnect the graph. Clearly, one has,

(87)    $\mathcal{M} = o + \mathcal{M}^{(i)} + \mathcal{M}^{(n)},$

where $\mathcal{M}^{(i)}$ (resp. $\mathcal{M}^{(n)}$) represent the class of isthmic (resp. non-isthmic) maps and
‘o’ is the graph consisting of a single vertex and no edge. There are accordingly two
ways to build maps from smaller ones by adding a new edge.


(i) The class of all isthmic maps is constructed by taking two arbitrary maps and 
joining them together by a new root edge, as shown below: 


¹⁶The four-colour theorem to the effect that every planar graph can be coloured using only four colours
was eventually proved by Appel and Haken in 1976, using structural graph theory methods supplemented
by extensive computer search.
The effect is to increase the number of edges by 1 (the new root edge) and have the
root face degree become 2 (the two sides of the new root edge) plus the sum of the
root face degrees of the component maps. The construction is clearly revertible. In
other words, the BGF of $\mathcal{M}^{(i)}$ is

(88)    $M^{(i)}(z,u) = zu^2M(z,u)^2.$


(ii) The class of non-isthmic maps is obtained by taking an already existing map
and adding an edge that preserves its root node and “cuts across” its root face in some
unambiguous fashion (so that the construction should be revertible). This operation
will therefore result in a new map with an essentially smaller root-face degree. For
instance, there are 5 ways to cut across a root face of degree 4, namely,

$$u^4 \mapsto zu^5 + zu^4 + zu^3 + zu^2 + zu^1.$$

In general the effect on a map with root face of degree k is described by the
transformation $u^k \mapsto zu(1 - u^{k+1})/(1 - u)$; equivalently, each monomial $g(u) = u^k$
is transformed into $zu(g(1) - ug(u))/(1 - u)$. Thus, the OGF of $\mathcal{M}^{(n)}$ involves a
discrete difference operator:

(89)    $M^{(n)}(z,u) = zu\,\dfrac{M(z,1) - uM(z,u)}{1 - u}.$

Collecting the contributions from (88) and (89) in (87) then yields the basic
functional equation,

(90)    $M(z,u) = 1 + u^2zM(z,u)^2 + uz\,\dfrac{M(z,1) - uM(z,u)}{1 - u}.$
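
Before any algebra, the functional equation for M(z, u) can already be used as a recurrence on the number of edges. The sketch below is an illustration only (the names `maps_by_root_face` and `closed_form` are hypothetical): it applies the isthmic term $zu^2M^2$ and the non-isthmic rule $u^k \mapsto z(u + u^2 + \cdots + u^{k+1})$ to polynomials in u, and compares the totals with the formula $2\,(2n)!\,3^n/(n!\,(n+2)!)$ for the number of maps with n edges.

```python
from math import factorial


def maps_by_root_face(n_max):
    """m[n][k] = number of rooted maps with n edges and root-face degree k."""
    m = [[1]]  # the vertex map: no edge, root face of degree 0
    for n in range(1, n_max + 1):
        new = [0] * (2 * n + 1)
        # isthmic case: z u^2 M(z,u)^2
        for i in range(n):
            a, b = m[i], m[n - 1 - i]
            for j, aj in enumerate(a):
                for k, bk in enumerate(b):
                    new[j + k + 2] += aj * bk
        # non-isthmic case: u^k -> z (u + u^2 + ... + u^(k+1))
        for k, ck in enumerate(m[n - 1]):
            for d in range(1, k + 2):
                new[d] += ck
        m.append(new)
    return m


def closed_form(n):
    """2 * 3^n * (2n)! / (n! (n+2)!), the number of rooted maps with n edges."""
    return 2 * 3**n * factorial(2 * n) // (factorial(n) * factorial(n + 2))


m = maps_by_root_face(7)
totals = [sum(row) for row in m]
print(totals)
print([closed_form(n) for n in range(8)])
```

Both printed lists should coincide, the recurrence giving in addition the refined counts by root-face degree (for instance `m[2]` lists the nine maps with two edges by root-face degree).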


The functional equation (90) binds two unknown functions, $M(z,u)$ and $M(z,1)$.
Much like in the case of walks, it would seem to be underdetermined. Now, a method
due to Tutte and known as the quadratic method provides solutions. Following Tutte
and the account in [244, p. 138], we consider momentarily the more general equation

(91)    $(g_1F(z,u) + g_2)^2 = g_3,$

where $g_j = G_j(z,u,h(z))$ and the $G_j$ are explicit functions; here the unknown
functions are $F(z,u)$ and $h(z)$ (cf. $M(z,u)$ and $M(z,1)$ in (90)). Bind u and z in such
a way that the left side of (91) vanishes, that is, substitute u = u(z) (a yet unknown
function) so that $g_1F + g_2 = 0$. Since the left-hand side of (91) now has a double root
in u, so must the right-hand side, which implies

(92)    $g_3 = 0, \qquad \left.\dfrac{\partial g_3}{\partial u}\right|_{u=u(z)} = 0.$
The original equation has become a system of two equations in two unknowns that
determines implicitly h(z) and u(z). From there, elimination provides individual
equations for u(z) and for h(z). (If needed, F(z, u) can then be recovered by solving a
quadratic equation.) It will be recognized that, if the quantities $g_1, g_2, g_3$ are
polynomials, then the process invariably yields solutions that are algebraic functions.

We now carry out this programme in the case of maps and Equation (90). First,
isolate $M(z,u)$ by completing the square, giving

(93)    $\left(M(z,u) - \dfrac{1}{2}\,\dfrac{1 - u + u^2z}{u^2z(1 - u)}\right)^2 = Q(z,u) - \dfrac{M(z,1)}{u(1 - u)},$

where

$$Q(z,u) = \frac{z^2u^4 - 2zu^2(u - 1)(2u - 1) + (1 - u)^2}{4u^4z^2(1 - u)^2}.$$

Next, the condition expressing the existence of a double root is

$$Q(z,u) - \frac{1}{u(1 - u)}M(z,1) = 0 \quad\text{and}\quad Q'_u(z,u) - \frac{2u - 1}{u^2(1 - u)^2}M(z,1) = 0.$$


It is now easy to eliminate $M(z,1)$, since the dependency in M is linear, and a
straightforward calculation shows that u = u(z) should satisfy

$$\left(u^2z + (u - 1)\right)\left(u^2z + (u - 1)(2u - 3)\right) = 0.$$

The first parameterization would lead to M(z, 1) = 1/z which is not acceptable.
Thus, u(z) is to be taken as the root of the second factor, with M(z, 1) being defined
parametrically by

(94)    $z = \dfrac{(1 - u)(2u - 3)}{u^2}, \qquad M(z,1) = -u\,\dfrac{3u - 4}{(2u - 3)^2}.$

We can then eliminate u and get an explicit equation for $M(z,1)$, which turns out to be
explicitly solvable. In summary, we obtain one of the very first results of the enumerative
theory of maps:


Proposition VII.8. The OGF of maps admits the explicit form

(95)    $M(z) \equiv M(z,1) = -\dfrac{1}{54z^2}\left(1 - 18z - (1 - 12z)^{3/2}\right),$

and the number of maps with n edges, $M_n = [z^n]M(z,1)$, is

(96)    $M_n = 2\,\dfrac{(2n)!\,3^n}{n!\,(n + 2)!},$

which satisfies asymptotically $M_n \sim \dfrac{2}{\sqrt{\pi n^5}}\,12^n$.

The sequence of coefficients is EIS A000168:

(97)    $M(z,1) = 1 + 2z + 9z^2 + 54z^3 + 378z^4 + 2916z^5 + 24057z^6 + 208494z^7 + \cdots.$


We refer to [244, Sec. 2.9] for detailed calculations (that are nowadays routinely
performed with assistance of a computer algebra system). Currently, there exist many
applications of the quadratic method to maps satisfying all sorts of combinatorial
constraints, in particular multiconnectivity; see [429] for a panorama. Interestingly
enough, the singular exponent of maps is universally 3/2, a fact further reflected by
the $n^{-5/2}$ factor in the asymptotic form of coefficients. Accordingly, randomness
properties of maps are appreciably different from what is observed in trees and many
commonly encountered context-free objects (e.g., irreducible ones).


▷ VII.33. Lagrangean parametrization of general maps. The change of parameter $w = 1 - 1/u$
reduces (94) to the “Lagrangean form”,

(98)    $w = \dfrac{z}{1 - 3w}, \qquad M(z,1) = \dfrac{1 - 4w}{(1 - 3w)^2},$

to which the Lagrange Inversion Theorem can be applied, giving back (96). ◁
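
A sketch of the computation suggested by this note, under the assumption that the Lagrangean form reads $w = z\phi(w)$ with $\phi(w) = 1/(1-3w)$ and $M(z,1) = H(w)$ with $H(w) = (1-4w)/(1-3w)^2$ (the names $\phi$ and $H$ are introduced here only for the illustration; binomial coefficients with a negative lower index are taken to be 0):

```latex
\begin{align*}
H'(w) &= \frac{2 - 12w}{(1-3w)^3}, \\
M_n = [z^n]\,H(w(z)) &= \frac{1}{n}\,[w^{n-1}]\,H'(w)\,\phi(w)^n
      = \frac{1}{n}\,[w^{n-1}]\,(2 - 12w)\,(1-3w)^{-n-3} \\
    &= \frac{3^{n-1}}{n}\left(2\binom{2n+1}{n-1} - 4\binom{2n}{n-2}\right)
      = \frac{3^{n-1}}{n}\cdot\frac{6\,(2n)!}{(n-1)!\,(n+2)!}
      = 2\,\frac{(2n)!\,3^n}{n!\,(n+2)!}, \qquad n \ge 1.
\end{align*}
```

For n = 1 and n = 2 the formula gives 2 and 9, matching the first coefficients of the series for M(z, 1).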


▷ VII.34. The number of planar graphs. The asymptotic number of labelled planar graphs
with n vertices was determined by Giménez and Noy [235] to be of the form

$$G_n \sim g\cdot\gamma^n n^{-7/2}\,n!, \qquad g = 0.4970\,04399, \quad \gamma = 27.22687\,77685.$$

This spectacular result, which settled a long-standing open question, is obtained by a
succession of combinatorial-analytic steps based on: (i) the enumeration of 3-connected maps (these
are the same as graphs, due to unique embeddability), which can be performed by the
quadratic method; (ii) the enumeration of 2-connected graphs by Bender, Gao, and Wormald [33];
(iii) the integro-differential relations that relate the GFs of 2-connected and 1-connected graphs.
The authors of [235] also show that a random planar graph is connected with probability
asymptotic to $e^{-\nu} \doteq 0.96325$ and the mean number of connected components is asymptotic to
$1 + \nu \doteq 1.03743$. ◁


VII. 9. Ordinary differential equations and systems 


In Part A of this book relative to Symbolic Methods, we have encountered differ- 
ential relations attached to several combinatorial constructions. 


– Pointing: the operation of pointing a specific atom in an object of a
combinatorial class $\mathcal{C}$ produces a pointed class $\mathcal{D} = \Theta\mathcal{C}$. If the generating function
of $\mathcal{C}$ is C(z) (an OGF in the unlabelled case, an EGF in the labelled case),
then one has

(99)    $\mathcal{D} = \Theta\mathcal{C} \quad\Longrightarrow\quad D(z) = z\,\dfrac{d}{dz}C(z).$

See Subsections I. 6.2 (p. 79) and II. 6.1 (p. 126).

– Order constraints: in Subsection II. 6.3 (p. 129), we have defined the boxed
product $\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C})$ to be the modified labelled product comprised of
pairs of elements such that the smallest label is constrained to lie in the $\mathcal{B}$
component. The translation over EGFs is

(100)    $\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}) \quad\Longrightarrow\quad A(z) = \displaystyle\int_0^z \left(\partial_t B(t)\right)\cdot C(t)\,dt.$


Thus pointing and order constraints systematically lead to integro-differential relations,
which transform into ordinary differential equations (ODEs) and systems. Another
rich source of differential equations in combinatorics is provided by the holonomic
framework (APPENDIX B: Holonomic functions, p. 693). We summarize below some
of the major methods that can be used to analyse the corresponding generating
functions. Our analytic arguments largely follow the accessible introductions found in the
books by Henrici [265] and Wasow [490]. Linear equations are examined in
Subsection VII. 9.1, some simple nonlinear ODEs in Subsection VII. 9.2. The main
applications discussed here are relative to trees associated to ordered structures: quadtrees
and increasing trees principally.
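
At the level of counting sequences, the boxed product translates into a binomial convolution: differentiating the integral relation gives A'(z) = B'(z)·C(z), that is, $a_n = \sum_k \binom{n-1}{k-1} b_k c_{n-k}$. The sketch below illustrates this on a standard specification that is assumed here for the example only, increasing binary trees $\mathcal{P} = 1 + (\mathcal{Z}^{\square} \star \mathcal{P} \star \mathcal{P})$, whose EGF satisfies P' = P², so that there should be n! trees of size n. All function names are hypothetical.

```python
from math import comb, factorial


def labelled_product(b, c, n):
    """Counts of the labelled product B * C for sizes 0..n-1."""
    return [sum(comb(m, k) * b[k] * c[m - k] for k in range(m + 1)) for m in range(n)]


def boxed_product(b, c, n):
    """Counts of the boxed product: the smallest label lies in the B component."""
    return [0] + [
        sum(comb(m - 1, k - 1) * b[k] * c[m - k] for k in range(1, m + 1))
        for m in range(1, n)
    ]


def increasing_binary_trees(n):
    """Counts p_0..p_n for P = 1 + (Z^box * P * P), obtained by iteration."""
    z = [0, 1] + [0] * n  # the atomic class: one object of size 1
    p = [1] + [0] * n     # start from the empty tree only
    for _ in range(n):    # each pass fixes at least one more coefficient
        pp = labelled_product(p, p, n + 1)
        p = boxed_product(z, pp, n + 1)
        p[0] = 1          # add the empty tree back
    return p


print(increasing_binary_trees(6))
```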


VII. 9.1. Singularity analysis of linear differential equations. Linear
differential equations with analytic coefficients have solutions that, near a reasonably
well-behaved singularity ζ, are of the form

$$Z^{\theta}(\log Z)^k H(Z), \qquad Z := z - \zeta,$$

with θ ∈ ℂ an algebraic number, $k \in \mathbb{Z}_{\ge 0}$, and H a locally analytic function. The
coefficients of such solutions are composed of elements that are asymptotically of the
form

$$\zeta^{-n}n^{\beta}(\log n)^k, \qquad \beta = -\theta - 1,$$

in accordance with the general correspondence provided by singularity analysis. For
instance, a naturally occurring combinatorial structure, the quadtree, gives rise to a
number sequence that, surprisingly, turns out to be asymptotic to $n^{(\sqrt{17}-3)/2}$.


Regular singularities. Our starting point is a linear ordinary differential
equation (linear ODE), which we take to be of the form

(101)    $c_0(z)\partial^rY(z) + c_1(z)\partial^{r-1}Y(z) + \cdots + c_r(z)Y(z) = 0, \qquad \partial \equiv \dfrac{d}{dz}.$

The integer r is the order. We assume that there exists a simply connected domain Ω
in which the coefficients $c_j \equiv c_j(z)$ are analytic. At a point $z_0$ where $c_0(z_0) \ne 0$, a
classical existence theorem (Note 35 and [490, p. 3]) guarantees that, in a
neighbourhood of $z_0$, there exist r linearly independent analytic solutions of the equation. Thus,
singularities can only occur at points ζ that are roots of the leading coefficient $c_0(z)$.

▷ VII.35. Analytic solutions. Consider the ODE (101) near $z_0 = 0$ and assume $c_0(0) \ne 0$.
Then, a formal solution Y(z) can be determined, given any set of initial conditions $Y^{(j)}(0) = w_j$,
by the method of indeterminate coefficients. The coefficients can be constructed recurrently,
and simple bounds show that they are of at most exponential growth. ◁


To proceed, we rewrite Equation (101) as

(102)    $\partial^rY(z) + d_1(z)\partial^{r-1}Y(z) + \cdots + d_r(z)Y(z) = 0,$

where $d_j = c_j/c_0$. Under our assumptions, the functions $d_j(z)$ are now meromorphic
in Ω. Given a meromorphic function f(z), we define $\omega_\zeta(f)$ to be the order of the pole
of f at ζ, and $\omega_\zeta(f) = 0$ means that f(z) is analytic at ζ.

Definition VII.7. The differential equations (101) and (102) are said to have a
singularity at ζ if at least one of the $\omega_\zeta(d_j)$ is positive. The point ζ is said to be a regular
singularity if

$$\omega_\zeta(d_1) \le 1, \quad \omega_\zeta(d_2) \le 2, \quad \ldots, \quad \omega_\zeta(d_r) \le r,$$

and an irregular singularity otherwise.


For instance, the second-order ODE

(103)    $Y'' + z^{-1}\sin(z)\,Y' - z^{-2}\cos(z)\,Y = 0,$

has a regular singular point at z = 0, since the orders are 0, 2, respectively. It is a
notable fact that, even though we do not know how to solve explicitly the equation in
terms of usual special functions, the asymptotic form of its solutions can be precisely
determined.

Let ζ be a regular singular point, and say we attempt to solve (103) by trying
a solution of the form $z^{\theta} + \cdots$. For instance, proceeding somewhat optimistically
with (103) at ζ = 0, we may expect the left hand side of the equation to be of the form

$$\left[\theta(\theta - 1)z^{\theta-2} + \cdots\right] + \left[\theta z^{\theta-1} + \cdots\right] - \left[z^{\theta-2} + \cdots\right] = 0.$$

In order to obtain cancellation to main asymptotic order ($z^{\theta-2}$), we must then assume
that the coefficient of $z^{\theta-2}$ vanishes; then, θ solves an algebraic equation of degree 2,
namely, θ(θ − 1) − 1 = 0, which suggests the possibility of two solutions of the form
$z^{\theta}$ near 0, with $\theta = (1 \pm \sqrt{5})/2$. This informal discussion motivates the following
definition.


Definition VII.8. Given an equation of the form (102) and a regular singular point ζ,
the indicial polynomial I(θ) at ζ is defined to be

$$I(\theta) = \theta^{\underline{r}} + \delta_1\theta^{\underline{r-1}} + \cdots + \delta_r, \qquad \theta^{\underline{\ell}} := \theta(\theta - 1)\cdots(\theta - \ell + 1),$$

where $\delta_j := \lim_{z\to\zeta}(z - \zeta)^jd_j(z)$. The indicial equation (at ζ) is the algebraic
equation I(θ) = 0.

If we let $\mathcal{L}$ denote the differential operator corresponding to the left hand side
of (102), we have formally, at a regular singular point

$$\mathcal{L}[Z^{\theta}] = I(\theta)Z^{\theta-r} + O(Z^{\theta-r+1}), \qquad Z = (z - \zeta),$$

which justifies the rôle of the indicial polynomial. (The process used to determine
the solutions by restricting attention to dominant asymptotic terms is analogous to
the Newton polygon construction for algebraic equations.) An important structure
theorem describes the possible types of solutions of a meromorphic ODE at a regular
singularity.
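
The definition of the indicial polynomial is mechanical enough to be sketched in code. The helper names below (`indicial_polynomial`, `real_roots_of_quadratic`) are hypothetical; the input is the list of limits $\delta_1, \ldots, \delta_r$, and the example is the second-order ODE with coefficients $z^{-1}\sin z$ and $-z^{-2}\cos z$ discussed above, for which $\delta_1 = 0$ and $\delta_2 = -1$ at ζ = 0.

```python
import math


def indicial_polynomial(deltas):
    """Monomial coefficients (constant term first) of
    I(theta) = theta^(r) + delta_1 theta^(r-1) + ... + delta_r,
    where theta^(l) = theta (theta - 1) ... (theta - l + 1) is a falling factorial."""
    r = len(deltas)
    weights = [1] + list(deltas)  # weights of theta^(r), theta^(r-1), ..., theta^(0)
    poly = [0] * (r + 1)
    falling = [1]                 # theta^(0) = 1
    for l in range(r + 1):
        w = weights[r - l]        # coefficient attached to theta^(l)
        for i, c in enumerate(falling):
            poly[i] += w * c
        # theta^(l+1) = theta^(l) * (theta - l)
        falling = [0] + falling
        for i in range(len(falling) - 1):
            falling[i] -= l * falling[i + 1]
    return poly


def real_roots_of_quadratic(poly):
    """Real roots (smaller first) of c + b*x + a*x^2, assuming a positive discriminant."""
    c, b, a = poly
    disc = math.sqrt(b * b - 4 * a * c)
    return ((-b - disc) / (2 * a), (-b + disc) / (2 * a))


poly = indicial_polynomial([0, -1])   # theta^2 - theta - 1
print(poly, real_roots_of_quadratic(poly))
```

The roots returned are $(1 \pm \sqrt{5})/2$, in agreement with the informal discussion that precedes the definition.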

Theorem VII.9 (Regular singularities of ODEs). Consider a meromorphic
differential equation (102) and a regular singular point ζ. Assume that the indicial equation
at ζ, I(θ) = 0, is such that no two roots differ by an integer (in particular, all roots
are distinct). Then, in a slit neighbourhood of ζ, there exists a linear basis of all the
solutions that is comprised of functions of the form

(104)    $(z - \zeta)^{\theta_j}H_j(z - \zeta),$

where $\theta_1, \ldots, \theta_r$ are the roots of the indicial polynomial and each $H_j$ is analytic at 0.
In the case of roots differing by an integer (or multiple roots), the solutions (104) may
include additional logarithmic terms involving nonnegative powers of log(z − ζ).

A description of the logarithmic cases is best based on a matrix treatment of the
first-order linear system that is equivalent to the ODE [265, 490]. Note 37 discusses
the representative case of Euler systems, which is explicitly solvable.


▷ VII.36. Singular solutions. In the first case (no two roots differing by an integer), it suffices
to work out the modified differential equation satisfied by $Z^{-\theta_j}Y(z)$ and verify that one of
its solutions is analytic at ζ: the coefficients of $H_j$ satisfy a solvable recurrence, like in the
nonsingular case, and their growth is verified to be at most exponential. ◁


▷ VII.37. Euler equations and systems. An equation of the form,

$$\partial^rY + e_1Z^{-1}\partial^{r-1}Y + \cdots + e_rZ^{-r}Y = 0, \qquad e_j \in \mathbb{C}, \quad Z := (z - \zeta),$$

is known as an Euler equation. In the case where all roots of the indicial equation are simple,
a basis of solutions is exactly of the form $Z^{\theta_j}$. When θ is a root of multiplicity m, the set of
solutions includes $Z^{\theta}(\log Z)^p$, for p = 0, …, m − 1. [Euler equations appear for instance in
the median-of-three quicksort algorithm [307, 434]. See [87] for several applications to random
tree models and the analysis of algorithms.] Euler systems are first order systems of the form

$$\frac{d}{dz}Y(z) = \frac{A}{z - \zeta}\,Y(z),$$

where $A \in \mathbb{C}^{r\times r}$ is a scalar matrix and $Y = (Y_1, \ldots, Y_r)^T$ is a vector of functions. A formal
solution is provided by

$$(z - \zeta)^A = \exp\left(A\log(z - \zeta)\right),$$

which indicates that the Jordan block decomposition of A plays a rôle in the occurrence of
logarithmic factors of solutions. ◁
▷ VII.38. Irregular singularities. The equation

$$(1 - z)^2Y' - Y = 0, \qquad Y(0) = 1$$

has Y(z) = exp(z/(1 − z)) as its solution. The point ζ = 1 is a singularity, but not one
of the regular type: the solution blows up exponentially near this point. See [265, 490] for
a general treatment. (The function Y(z) occurs as the EGF of “fragmented permutations” in
Subsection II. 4.2, p. 115.) The analysis of coefficients in such cases resorts to the saddle point
method exposed in Chapter VIII. ◁


Theorem VII.10 (Coefficient asymptotics for meromorphic ODEs). Let f(z) be
analytic at 0 and satisfy a linear differential equation

$$\frac{d^r}{dz^r}f(z) + c_1(z)\frac{d^{r-1}}{dz^{r-1}}f(z) + \cdots + c_r(z)f(z) = 0,$$

where the coefficients $c_j(z)$ are analytic in $|z| < \rho_1$, except for possibly a pole at
some ζ satisfying $|\zeta| < \rho_1$, ζ ≠ 0. Assume that ζ is a regular singular point and no
two roots of the indicial equation at ζ differ by an integer. Then, there exist scalar
constants $\lambda_1, \ldots, \lambda_r \in \mathbb{C}$ such that for any $\rho_0$ with $|\zeta| < \rho_0 < \rho_1$, one has

(105)    $[z^n]f(z) = \displaystyle\sum_{j=1}^{r}\lambda_j\Delta_j(n) + O(\rho_0^{-n}),$

where the $\Delta_j(n)$ are of the asymptotic form

(106)    $\Delta_j(n) \sim \dfrac{n^{-\theta_j-1}}{\Gamma(-\theta_j)}\,\zeta^{-n}(-\zeta)^{\theta_j}\left[1 + \displaystyle\sum_{i=1}^{\infty}\frac{s_{i,j}}{n^i}\right],$

and the $\theta_j$ are the roots of the indicial equation at ζ.


PROOF. The coefficients $\lambda_j$ relate the particular solution f(z) to the basis of
solutions (104). The rest, by singularity analysis, is nothing but a direct transcription to
coefficients of the solutions provided by the structure theorem, Theorem VII.9, with

$$\Delta_j(n) = [z^n](z - \zeta)^{\theta_j}H_j(z - \zeta).$$
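
The transcription to coefficients can be observed numerically in the simplest situation ζ = 1 and $H \equiv 1$, where $[z^n](1 - z)^{\theta}$ should be close to $n^{-\theta-1}/\Gamma(-\theta)$. The sketch below is an illustration only; the function name `coeff_one_minus_z_power` and the sample exponent $\theta = (1 - \sqrt{17})/2$ are choices made for the example.

```python
import math


def coeff_one_minus_z_power(theta, n):
    """[z^n] (1 - z)^theta, via the ratio c_k / c_(k-1) = (k - 1 - theta) / k."""
    c = 1.0
    for k in range(1, n + 1):
        c *= (k - 1 - theta) / k
    return c


theta = (1 - math.sqrt(17)) / 2
n = 1000
exact = coeff_one_minus_z_power(theta, n)
approx = n ** (-theta - 1) / math.gamma(-theta)
print(exact, approx, exact / approx)  # the ratio should be 1 + O(1/n)
```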
Taking into account multiple roots (as in Note 37) and roots differing by an
integer, we see that solutions to meromorphic linear ODEs, in the regular case at least, are
only composed of linear combinations of asymptotic elements of the form¹⁷

(107)    $\zeta^{-n}n^{\beta}(\log n)^{\ell},$

where ζ is determined as root of a (possibly transcendental) equation, $c_0(\zeta) = 0$, the
number β is an algebraic quantity (over the field of constants $\delta_j$) determined by the
polynomial equation I(−β − 1) = 0, and ℓ is an integer.

The coefficients $\lambda_j$ serve to “connect” the particular function of interest, f(z),
to the local basis of singular solutions (104). Their determination thus represents a
connection problem (see pp. 448 and 481). However, contrary to what happens for
algebraic equations, the determination of the $\lambda_j$ can only be approached in all
generality by numerical methods [203]. (Even when the coefficients $d_j(z) \in \mathbb{Q}(z)$ are
rational fractions, no decision procedure is available to decide, from an $f(z) \in \mathbb{Q}[[z]]$
determined by initial conditions at 0, which of the connection coefficients $\lambda_j$ may
vanish.) In many combinatorial applications the calculations can be carried out
explicitly, in which case the forms (107) serve as a beacon of what to expect
asymptotically. (Once existence of such forms is granted, e.g., by Theorems VII.9 and VII.10,
it is often possible to identify coefficients and/or exponents in asymptotic expansions
directly.) Similar considerations apply to functions defined by systems of linear
differential equations (Note 41 below).
▷ VII.39. Multiple singularities. In the case of several singularities $\zeta_1, \ldots, \zeta_s$, a sum of s
terms, each of the form (106) with $\zeta \to \zeta_j$, expresses $[z^n]f(z)$. [The structure theorem applies
at each $\zeta_j$ and singularity analysis is known to adapt to multiple singularities, Section VI.5,
p. 381.] ◁

▷ VII.40. A relaxation. In Theorem VII.10, one may allow the equation to have a singularity
of any kind at 0. [Only properties of the basis of solutions near ζ are used.] ◁


▷ VII.41. Equivalence between equations and systems. A (first-order) linear differential system
is by definition

$$\frac{d}{dz}Y(z) = A(z)Y(z),$$

where $Y = (Y_1, \ldots, Y_m)^T$ is an m-dimensional column vector and A is an m × m coefficient
matrix. A differential equation of order m can always be reduced to a system of dimension m,
and conversely. Only rational operations and derivatives are involved in each of the
conversions: technically, coefficient manipulations take place in a differential field $\mathbb{K}$ that contains
coefficients of recurrences and systems. (For instance, the set of rational functions ℂ(z) and
the set of meromorphic functions in an open set Ω are differential fields.)

The proofs are simple extensions of the case m = 2. Starting from the equation y'' + by' +
cy = 0, one sets $Y_1 = y$, $Y_2 = y'$ to get the system

$$\{\partial Y_1 = Y_2, \quad \partial Y_2 = -cY_1 - bY_2\}.$$

Conversely, given the system

$$\{\partial Y_1 = a_{11}Y_1 + a_{12}Y_2, \quad \partial Y_2 = a_{21}Y_1 + a_{22}Y_2\},$$


¹⁷The forms (107) are appreciably more general than the corresponding ones arising in algebraic
coefficient asymptotics (Theorem VII.8, p. 477), since, there, no logarithmic term can be present and the
exponents are constrained to be rational numbers only, reflecting the Newton-Puiseux expansion.
FIGURE VII.20. The quadtree splitting process (left, with quadrants NW, NE, SW, SE); a
hierarchical partition associated to n = 50 random points.


let $\mathcal{E} = \mathrm{VS}[Y_1, Y_2]$ be the vector space over $\mathbb{K}$ spanned by $Y_1$, $Y_2$, which is of dimension ≤ 2.
Differentiation of the relation $\partial Y_1 = a_{11}Y_1 + a_{12}Y_2$ shows that $\partial^2Y_1$ can be expressed as a
combination of $Y_1$, $Y_2$,

$$\partial^2Y_1 = a'_{11}Y_1 + a'_{12}Y_2 + a_{11}(a_{11}Y_1 + a_{12}Y_2) + a_{12}(a_{21}Y_1 + a_{22}Y_2),$$

hence $\partial^2Y_1$ lies in $\mathcal{E}$. Thus, the system $\{Y_1, \partial Y_1, \partial^2Y_1\}$ is bound, which corresponds to a
differential equation of order 2 being satisfied by $Y_1$. (In the case where the coefficient matrix A
has a simple pole at ζ, singularities of solutions can be studied by matrix methods akin to those
of Note 37.) ◁


Combinatorial applications. The quadtree is a tree structure, discovered by
Finkel and Bentley [166], that can be superimposed on any sequence of points in some
Euclidean space $\mathbb{R}^d$. In computer science, it forms the basis of several algorithms for
maintaining and searching dynamically varying geometric objects [428]. Quadtrees
are associated to differential equations, whose order is equal to the dimension of the
underlying space. Some of their major characteristics can be determined via
singularity analysis of these equations [183, 194].


EXAMPLE VII.21. The plain quadtree. Start from the unit square $Q = [0,1]^2$ and
let $p = (P_1, \ldots, P_n)$, where $P_j = (x_j, y_j)$, be a sequence of n points drawn uniformly
and independently from Q. A quaternary tree, called the quadtree and noted QT(p), is built
recursively from p as follows:

– if p is the empty sequence (n = 0), then QT(p) = ∅ is the empty tree;

– otherwise, let $p_{NW}, p_{NE}, p_{SW}, p_{SE}$ be the four subsequences of points of p that lie
respectively North-West, North-East, South-West, South-East of $P_1$. For instance
$p_{SW}$ is $p_{SW} = (P_{j_1}, P_{j_2}, \ldots, P_{j_k})$, where $1 < j_1 < j_2 < \cdots < j_k \le n$, and the
$P_{j_\ell} = (x_{j_\ell}, y_{j_\ell})$ are those points which satisfy the predicate $x_{j_\ell} < x_1$ and $y_{j_\ell} < y_1$.
Then QT(p) is

$$QT(p) = \langle P_1;\ QT(p_{NW}),\ QT(p_{NE}),\ QT(p_{SW}),\ QT(p_{SE})\rangle.$$

In other words, the sequence of points induces a hierarchical partition of the space Q; see
Figure 20. (For simplicity, the tree is only defined here for points having different x and y
coordinates, an event that has probability 1.)
Quadtrees are used for searching in two related ways: (i) given a point $P_0 = (x_0, y_0)$,
exact search aims at determining whether $P_0$ occurs in p; (ii) given a coordinate $x_0 \in [0,1]$, a
partial-match query asks for the set of points P = (x, y) occurring in p such that $x = x_0$
(irrespective of the values of y). Both types are accommodated by the quadtree structure: an exact
search corresponds to descending in the tree, following a branch guided by the coordinates of
the point $P_0$ that is sought; partial match is implemented by recursive descents into two subtrees
(either the pair NW, SW or NE, SE) based on the way $x_0$ compares to the x coordinate of
the root point. In an ideal world for computers, trees are perfectly balanced, in which case the
search costs satisfy the approximate recurrences,

(108)    $f_n = 1 + f_{n/4}, \qquad g_n = 1 + 2g_{n/4},$

for exact search and partial match respectively. The solutions of these recurrences are $\approx \log_4 n$
and $\approx \sqrt{n}$, respectively.
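
The construction and the two search procedures can be sketched as follows. This is a minimal illustration, not the implementation of [166]: the class and function names are hypothetical, points are inserted one at a time (which yields the same tree as the recursive definition), and each search also reports the number of nodes visited, which is the cost measure analysed in what follows.

```python
class Node:
    def __init__(self, point):
        self.point = point
        self.children = {"NW": None, "NE": None, "SW": None, "SE": None}


def quadrant(root, p):
    """Quadrant of p relative to root (coordinates are assumed to be distinct)."""
    ns = "N" if p[1] > root[1] else "S"
    we = "W" if p[0] < root[0] else "E"
    return ns + we


def insert(tree, p):
    if tree is None:
        return Node(p)
    q = quadrant(tree.point, p)
    tree.children[q] = insert(tree.children[q], p)
    return tree


def build(points):
    tree = None
    for p in points:
        tree = insert(tree, p)
    return tree


def exact_search(tree, p):
    """Return (number of nodes visited, whether p was found)."""
    visited = 0
    while tree is not None:
        visited += 1
        if tree.point == p:
            return visited, True
        tree = tree.children[quadrant(tree.point, p)]
    return visited, False


def partial_match(tree, x0):
    """Return (number of nodes visited, points whose x coordinate equals x0)."""
    if tree is None:
        return 0, []
    side = "W" if x0 < tree.point[0] else "E"
    visited = 1
    found = [tree.point] if tree.point[0] == x0 else []
    for ns in "NS":  # descend into the two subtrees on the relevant side
        v, f = partial_match(tree.children[ns + side], x0)
        visited += v
        found += f
    return visited, found


points = [(0.5, 0.5), (0.2, 0.8), (0.7, 0.3), (0.1, 0.1), (0.9, 0.9), (0.3, 0.6)]
tree = build(points)
print(exact_search(tree, (0.3, 0.6)))
print(partial_match(tree, 0.3))
```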

To what extent do randomly grown quadtrees differ from the perfect shape, and what is the 
growth of the cost functions on average? The answer lies in the singularities of certain linear 
differential equations. 

Our purpose is to set up tree recurrences in the spirit of Subsection VI. 10.3, p. 409. We
need the probability $\pi_{n,k}$ that a quadtree of size n gives rise to a NW root-subtree of size k and
claim that

(109)    $\pi_{n,k} = \dfrac{1}{n}\left(H_n - H_k\right), \qquad H_n = 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n}.$

Indeed, the probability that ℓ elements are West of the root and k are North-West is

(110)    $\varpi_{n,\ell,k} = \dbinom{n-1}{k,\ \ell-k,\ n-1-\ell}\displaystyle\int_0^1\!\!\int_0^1 (xy)^k\,(x(1-y))^{\ell-k}\,(1-x)^{n-1-\ell}\,dx\,dy.$

(The double integral is the probability that the first k elements fall NW, the next ℓ − k fall SW,
the rest fall East; the integrand corresponds to a conditioning upon the coordinates (x, y) of the
root; the multinomial coefficient takes into account the possible shufflings.) The Eulerian Beta
integral (p. 693) simplifies the integrals to $\varpi_{n,\ell,k} = 1/(n(\ell+1))$, from which the claimed (109)
follows by summation over ℓ. (It is also possible, though less convenient, to develop equations
starting from basic principles of the symbolic method.)
Given (109), the recurrence

(111)    $P_n = n + 4\displaystyle\sum_{k=0}^{n-1}\pi_{n,k}P_k, \qquad P_0 = 0,$

with $\pi_{n,k}$ as in (109), determines the sequence of expected values of path length. This recurrence
translates into the integral equation,

(112)    $P(z) = \dfrac{z}{(1-z)^2} + 4\displaystyle\int_0^z\frac{dt}{t(1-t)}\int_0^t P(u)\,\frac{du}{1-u},$

itself equivalent to the linear differential equation of order 2,

$$z(1-z)^4P''(z) + (1-2z)(1-z)^3P'(z) - 4(1-z)^2P(z) = 1 + 3z.$$

The homogeneous equation has a regular singularity at z = 1. In such a simple case, it is not
difficult to guess the “right” solution, which can then be verified by substitution:

$$P(z) = \frac{1}{3}\,\frac{1+2z}{(1-z)^2}\log\frac{1}{1-z} + \frac{1}{6}\,\frac{4z+z^2}{(1-z)^2}, \qquad P_n = \left(n + \frac{1}{3}\right)H_n - \frac{n+1}{6}.$$

The ratio $P_n/n$ represents the mean level of a random node in a randomly grown quadtree, a
quantity which is thus log n + O(1). Accordingly, quadtrees are on average fairly balanced,
the expected level being within a factor $\log 4 \doteq 1.38$ of the corresponding quantity in a perfect
tree.
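
The path-length recurrence lends itself to an exact check with rational arithmetic. The sketch below (function names hypothetical) computes $P_n$ from $P_n = n + 4\sum_{k<n}\pi_{n,k}P_k$ with $\pi_{n,k} = (H_n - H_k)/n$, and compares it with the coefficient of $z^n$ in the closed form of P(z), namely $(n + \tfrac13)H_n - \tfrac{n+1}{6}$.

```python
from fractions import Fraction


def harmonic(n):
    """Harmonic number H_n as an exact fraction."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def path_length(n_max):
    """Expected path lengths P_0..P_{n_max} from the recurrence, with harmonic numbers."""
    H = [harmonic(n) for n in range(n_max + 1)]
    P = [Fraction(0)]
    for n in range(1, n_max + 1):
        P.append(n + 4 * sum((H[n] - H[k]) / n * P[k] for k in range(n)))
    return P, H


P, H = path_length(12)
closed = [(n + Fraction(1, 3)) * H[n] - Fraction(n + 1, 6) for n in range(13)]
print(P[1:5])  # exact values 1, 3, 49/9, 295/36
print(all(P[n] == closed[n] for n in range(1, 13)))
```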

The analysis of partial match reveals a curious consequence of the imbalance of quadtrees,
where the order of growth differs from what the perfect tree model (108) predicts. The
recurrence satisfied by the expected cost of a partial match query is determined by methods similar
to pathlength [183]. One finds, by a computation similar to (110),

(113)    $Q_n = 1 + \dfrac{4}{n(n+1)}\displaystyle\sum_{k=0}^{n-1}(n-k)\,Q_k, \qquad Q_0 = 0,$

corresponding, for the GF $Q(z) = \sum Q_nz^n$, to the inhomogeneous differential equation,
$\mathcal{L}[Q(z)] = 2/(1 - z)$, where the differential operator $\mathcal{L}$ is

(114)    $\mathcal{L}[f] = z(1-z)^2\,\partial^2f + 2(1-z)^2\,\partial f - 4f.$

A particular solution of the inhomogeneous equation is −1/(1 − z), so that y(z) := Q(z) +
1/(1 − z) satisfies the homogeneous equation $\mathcal{L}[y] = 0$.

The differential equation $\mathcal{L}[y] = 0$ is singular at z = 0, 1, +∞ and it has a regular
singularity at z = 1. Since one has $y_n = O(n)$, by the origin of the problem, the singularity
at z = 1 is the one that matters. The indicial polynomial can be computed from its definition
or, equivalently, by simply substituting $y = (z - 1)^{\theta}$ in the definition of $\mathcal{L}$ and discarding lower
order terms. One finds, with Z = z − 1:

$$\mathcal{L}[Z^{\theta}] = \theta(\theta - 1)Z^{\theta} - 4Z^{\theta} + O(Z^{\theta+1}).$$

The roots of the indicial equation are then

$$\theta_1 = \frac{1}{2}\left(1 - \sqrt{17}\right), \qquad \theta_2 = \frac{1}{2}\left(1 + \sqrt{17}\right).$$

Theorem VII.9 guarantees that y(z) admits, near z = 1, a representation of the form

(115)    $y(z) = \lambda_1(1 - z)^{\theta_1}H_1(z - 1) + \lambda_2(1 - z)^{\theta_2}H_2(z - 1),$

with $H_1$, $H_2$ analytic at 0.

In order to complete the analysis, we still have to verify that the coefficient $\lambda_1$, which
multiplies the singular element that dominates as z → 1, is nonzero. Indeed, if we had $\lambda_1 = 0$,
then one would have y(z) → 0 as z → 1, which contradicts the fact that $y_n \ge 1$. In other
words, here the connection problem is solved by means of bounds that are available from the
combinatorial origin of the problem. Singularity analysis then yields the asymptotic form of
$y_n$, hence of $Q_n$. Summarizing, we have:


Proposition VII.9. Path length in a randomly grown quadtree of size n is on average n log n +
O(n). The expected cost of a partial match query satisfies


(116)    $$Q_n \sim \kappa\, n^{\alpha-1}, \qquad \alpha = \frac{\sqrt{17}-1}{2} \approx 1.56155.$$


The analysis extends to quadtrees of higher dimensions [183]. In general dimension d,
pathlength is on average $\frac{2}{d}\, n \log n + O(n)$. The cost of a partial match query is of the
order of $n^{\beta}$, where $\beta$ is an algebraic number of degree d. ..... END OF EXAMPLE VII.21.
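As a numerical sanity check on (113) and (116), the recurrence can simply be iterated. The sketch below is an illustration only: the function name `partial_match_costs` and the sample sizes 2000 and 4000 are arbitrary choices, and double-precision floats are assumed to be accurate enough at this scale. It compares the growth of $y_n = Q_n + 1$ with the exponent $\alpha - 1 = (\sqrt{17}-3)/2$ and with the hypergeometric constant $\Gamma(2\alpha)/(2\Gamma(\alpha)^3)$ quoted in Note VII.42.

```python
from math import gamma, log, sqrt


def partial_match_costs(n_max):
    """Return [Q_0, ..., Q_{n_max}] from recurrence (113), in floating point."""
    q = [0.0] * (n_max + 1)
    prefix = 0.0    # sum of Q_k over k < n
    weighted = 0.0  # sum of (n - k) * Q_k over k < n
    for n in range(1, n_max + 1):
        prefix += q[n - 1]
        weighted += prefix  # passing from n-1 to n adds one copy of every Q_k, k < n
        q[n] = 1.0 + 4.0 * weighted / (n * (n + 1))
    return q


alpha = (sqrt(17) - 1) / 2
kappa = gamma(2 * alpha) / (2 * gamma(alpha) ** 3)

q = partial_match_costs(4000)
y = [value + 1.0 for value in q]  # y_n = Q_n + 1 solves the homogeneous equation

# Empirical growth exponent from a doubling of n, and empirical constant.
exponent_estimate = log(y[4000] / y[2000]) / log(2)
kappa_estimate = y[4000] / 4000 ** (alpha - 1)

print("exponent:", exponent_estimate, "vs alpha - 1 =", alpha - 1)
print("constant:", kappa_estimate, "vs kappa =", kappa)
```

The two running sums make each step O(1), so the whole table costs O(n) operations rather than the O(n^2) of a literal transcription of the sum in (113).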


> VII.42. Quadtrees and hypergeometric functions. For the plain quadtree (d = 2), the change
of variables $y = (1-z)^{-\theta}\,\eta(z)$ reduces the differential equation $\mathcal{L}[y] = 0$ to hypergeometric
form. The constant $\kappa$ in (116) is then found to satisfy


$$\kappa = \frac{1}{2}\,\frac{\Gamma(2\alpha)}{\Gamma(\alpha)^3}, \qquad \alpha = \frac{\sqrt{17}-1}{2}.$$


Hypergeometric solutions (Note 15, p. 696) are available for d > 2 [86, 183, 194]. <


> VII.43. Closed meanders. A closed meander of size n is a topological configuration de-
scribing the way a loop can cross a road 2n times. The sequence starts as 1, 1, 2, 8, 42, 262
(EIS A005315). For instance, here is a meander of size 5:


There are good reasons to believe that the number $M_n$ of meanders satisfies


$$M_n \sim C A^n n^{-\beta}, \qquad \text{where } \beta = \frac{29+\sqrt{145}}{12},$$
based on analogies with well-established models of statistical physics [127]. <


VII. 9.2. Nonlinear differential equations. Solutions to nonlinear equations do 
not necessarily have singularities that arise from the equation itself (as in the linear 
case). Even the simplest nonlinear equation, 


$$Y'(z) = Y(z)^2, \qquad Y(0) = 1/a,$$


has a solution Y(z) = 1/(a — z) whose singularity depends on the initial condition 
and is not visible on the equation itself. The problem of determining the location of 
singularities is nonobvious in the case of a nonlinear ODE. Furthermore, the problem 
of determining the nature of singularities for nonlinear equations defies classification 
in the general case [417, 418]. In this section, we thus limit ourselves to examining 
a few examples where enough structure is present in the combinatorics, so that fairly 
explicit solutions are available, which are then amenable to singularity analysis. 


EXAMPLE VII.22. Increasing varieties of trees. Consider a labelled class defined by either of
(117)    $$\mathcal{Y} = \mathcal{Z}^{\square} \star \mathrm{SEQ}_{\Omega}(\mathcal{Y}), \qquad \mathcal{Y} = \mathcal{Z}^{\square} \star \mathrm{SET}_{\Omega}(\mathcal{Y}),$$


where a set of integers $\Omega \subseteq \mathbb{Z}_{\ge 0}$ has been fixed. This defines trees that are either plane (SEQ)
or nonplane (SET) and increasing, in the sense that labels go in increasing order along any
branch stemming from the root. Such trees have been encountered in Subsection II.6.3 (p. 129)
in relation to alternating permutations, general permutations, and regressive mappings. By
the symbolic translation of the boxed product, the EGF of $\mathcal{Y}$ satisfies a nonlinear differential
equation


(118)    $$Y(z) = \int_0^z \phi(Y(w))\,dw,$$
where the structure function $\phi$ is
$$\phi(y) = \sum_{\omega\in\Omega} y^{\omega} \quad (\text{case SEQ}), \qquad \phi(y) = \sum_{\omega\in\Omega} \frac{y^{\omega}}{\omega!} \quad (\text{case SET}).$$


The integral equation (118) is our starting point; in order to unify both cases, we set $\phi_{\omega} :=
[y^{\omega}]\phi(y)$. The discussion below is excerpted from the paper [40].


First note that (118) is equivalent to the nonlinear differential equation


(119)    $$Y'(z) = \phi(Y(z)), \qquad Y(0) = 0,$$


which implies that $Y'/\phi(Y) = 1$ and, upon integrating back,


(120)    $$\int_0^{Y(z)} \frac{d\eta}{\phi(\eta)} = z, \quad \text{i.e.,} \quad K(Y(z)) = z, \qquad K(y) := \int_0^{y} \frac{d\eta}{\phi(\eta)}.$$
Thus, the EGF $Y(z)$ is the compositional inverse of the integral of the multiplicative inverse of
the structure function. We can visualize this chain of transformations as follows:


(121)    $$Y = \mathrm{Inv} \circ \int \circ \frac{1}{(\cdot)} \circ \phi.$$


        Diff. eq.             EGF                   $\rho$      Sing. type    Coeff.
   A:   $Y' = (1+Y)^2$        $z/(1-z)$             $1$         $Z^{-1}$      $Y_n = n!$
   B:   $Y' = 1+Y^2$          $\tan z$              $\pi/2$     $Z^{-1}$      $Y_{2n+1}/(2n+1)! \sim 2\,(2/\pi)^{2n+2}$
   C:   $Y' = e^{Y}$          $\log (1-z)^{-1}$     $1$         $\log Z$      $Y_n = (n-1)!$
   D:   $Y' = 1/(1-Y)$        $1-\sqrt{1-2z}$       $1/2$       $Z^{1/2}$     $Y_n = (2n-3)!!$


FIGURE VII.21. Some classical varieties of increasing trees: (A) plane binary; (B) strict
plane binary; (C) increasing Cayley; (D) increasing plane.


In simpler situations, the integration defining K (y) in (120) can be carried out explicitly 
and explicit expressions may become available for Y(z). Figure 21 displays data relative to 
four such classes, the first three of which were already encountered in Chapter II. In each case, there
is listed: the differential equation (from which the definition of the trees and the form of ¢ are 
apparent), the dominant positive singularity, the singularity type, and the corresponding form 
of coefficients. The general analytic expressions of (120) contain much more: they allow for a 
general discussion of singularity types and permit us to analyse asymptotically classes that do 
not admit of an explicit GF. 
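The coefficient column of Figure 21 can be checked mechanically from the differential equation $Y' = \phi(Y)$, $Y(0) = 0$, by extracting coefficients one at a time: $(n+1)\,[z^{n+1}]Y = [z^n]\phi(Y)$. The sketch below is only an illustration; the helper names `mul_trunc` and `increasing_tree_counts` are hypothetical, and non-polynomial structure functions (cases C and D) are passed as Taylor polynomials truncated at the requested order, which is exact here because $Y(0) = 0$.

```python
from fractions import Fraction
from math import factorial


def mul_trunc(a, b, order):
    """Product of two coefficient lists, truncated after z^order."""
    c = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[:order + 1]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:order + 1 - i]):
            c[i + j] += ai * bj
    return c


def increasing_tree_counts(phi, n_max):
    """Return [Y_0, ..., Y_{n_max}] for Y' = phi(Y), Y(0) = 0.

    phi[w] is the coefficient of y^w in the structure function.
    """
    y = [Fraction(0)] * (n_max + 1)  # y[k] = Y_k / k!
    for n in range(n_max):
        # Horner evaluation of phi(Y(z)) modulo z^(n+1); only y[0..n] is used.
        acc = [Fraction(0)] * (n + 1)
        for coeff in reversed(phi):
            acc = mul_trunc(acc, y, n)
            acc[0] += Fraction(coeff)
        y[n + 1] = acc[n] / (n + 1)
    return [int(y[k] * factorial(k)) for k in range(n_max + 1)]


N = 7
case_a = increasing_tree_counts([1, 2, 1], N)                      # phi = (1 + y)^2
case_b = increasing_tree_counts([1, 0, 1], N)                      # phi = 1 + y^2
case_c = increasing_tree_counts(
    [Fraction(1, factorial(w)) for w in range(N + 1)], N)          # phi = e^y, truncated
case_d = increasing_tree_counts([1] * (N + 1), N)                  # phi = 1/(1 - y), truncated
print(case_a, case_b, case_c, case_d, sep="\n")
```

Exact rational arithmetic is used so that the integer counts $Y_n$ come out without rounding; the cost is polynomial in the order, which is adequate for small tables.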

Assume for simplicity $\phi$ to be an aperiodic entire function (possibly a polynomial). Let
$\rho$ be the radius of convergence of $Y(z)$, which is a singular point (by Pringsheim's Theorem).
Consider the limiting value $Y(\rho)$. One cannot have $Y(\rho) < \infty$ since then $K$, being analytic
at $Y(\rho)$, would be analytically invertible (by the implicit function theorem). Thus, one must
have $Y(\rho) = +\infty$ and, since $Y$ and $K$ are inverses of each other, we get $K(+\infty) = \rho$. The
radius of convergence of $Y(z)$ is accordingly determined as


$$\rho = \int_0^{\infty} \frac{d\eta}{\phi(\eta)}.$$


The singularity type of $Y(z)$ is then systematically determined by the rules (121). For a general
polynomial of degree $d \ge 2$, we have (ignoring coefficients)


$$K(+\infty) - K(y) = \int_y^{\infty} \frac{d\eta}{\phi(\eta)} \asymp \frac{1}{y^{d-1}}, \qquad Y(z) \asymp Z^{-1/(d-1)} \quad \text{with } Z := (\rho - z).$$


This back-of-the-envelope calculation shows that


(122)    for $\phi$ a polynomial of degree d:  $$Y_n \sim C\, n!\, \rho^{-n}\, n^{\beta}, \quad \text{with } \beta = -\frac{d-2}{d-1}.$$


In the same vein, the logarithmic singularity of the EGF of increasing Cayley trees (Case C of
Figure 21) appears as eventually reflecting the inverse of the exponential singularity of $\phi(y) =
e^{y}$. Such a singularity type must then be systematically present when considering increasing
nonplane trees (increasing Cayley trees) with a finite collection of node degrees excluded;
in other words, whenever the SET constructor is used in (117) and $\Omega$ is a cofinite set. This
observation “explains” and extends an analysis of [358].


Consider next an additive parameter of trees¹⁸ defined by a recurrence,
(123)    $$s(\tau) = t_{|\tau|} + \sum_{\upsilon \prec \tau} s(\upsilon),$$


where $(t_n)$ is a numeric sequence of “tolls” with $t_0 = 0$, and the summation $\upsilon \prec \tau$ is carried
out over all root subtrees $\upsilon$ of $\tau$. Introduce the two functions (of cumulated values)


$$S(z) = \sum_{\tau \in \mathcal{Y}} s(\tau)\, \frac{z^{|\tau|}}{|\tau|!}, \qquad T(z) = \sum_{n \ge 0} t_n Y_n\, \frac{z^n}{n!},$$


so that the ratio $[z^n]S(z) \,/\, [z^n]Y(z)$ equals the mean value of parameter s taken over all
increasing trees of size n. By simple algebra similar to Lemma VII.1 (p. 439), it is found that
the GF $S(z)$ is


(124)    $$S(z) = Y'(z) \int_0^z \frac{T'(w)}{Y'(w)}\, dw.$$


The relation (124) defines an integral transform $T \mapsto S$, which can be viewed as a singularity
transformer. Thanks to the methods of Subsection VI. 10.3, p. 409, its systematic study is 
doable, once the singularity type of Y (z) is known. 
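The "simple algebra" leading to (124) can be spelled out as follows. This is a reconstruction under the assumptions of the present example (namely $Y' = \phi(Y)$, $Y(0) = 0$, and $\phi(0) \neq 0$), offered as a sketch rather than as the argument of the reference [40]. Marking the root subtree that carries the cost gives an integral equation for $S$, which is then solved by variation of constants.

```latex
\begin{align*}
S(z) &= T(z) + \int_0^z \phi'(Y(w))\, S(w)\, dw
  && \text{(toll at the root, plus the cost of one marked root subtree)} \\
S'(z) &= T'(z) + \phi'(Y(z))\, S(z)
  && \text{(differentiate)} \\
Y''(z) &= \phi'(Y(z))\, Y'(z)
  && \text{(differentiate (119): $Y'$ solves the homogeneous equation)} \\
\left( \frac{S(z)}{Y'(z)} \right)' &= \frac{T'(z)}{Y'(z)}
  && \text{(set $S = Y' \cdot C$ and simplify)} \\
S(z) &= Y'(z) \int_0^z \frac{T'(w)}{Y'(w)}\, dw
  && \text{(integrate, using $S(0) = 0$ and $Y'(0) = \phi(0) \neq 0$).}
\end{align*}
```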

The discussion of path length ($t_n = n$ corresponding to $T(z) = zY'(z)$) is conducted
in the present perspective as follows. For polynomial varieties of increasing trees, we have
$Y(z) \asymp Z^{-\delta}$ with $\delta = 1/(d-1)$, so that

$$T \asymp Y' \asymp Z^{-\delta-1}, \qquad T' \asymp Z^{-\delta-2}, \qquad \frac{T'}{Y'} \asymp Z^{-1}, \qquad \int \frac{T'}{Y'} \asymp \log Z.$$
Thus, the relation between $Y$ and $S$ is of the simplified form $S \asymp Y' \log Z$. Singularity
analysis then implies that average path length is of order n log n. Working out the constants
gives:
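Proposition VII.10. Let $\mathcal{Y}$ be an increasing variety of trees defined by a function $\phi$ that is an
aperiodic polynomial of degree $d \ge 2$ and let $\delta = 1/(d-1)$. The number of trees of size n
satisfies
$$Y_n \sim \frac{n!}{\Gamma(\delta)} \left( \frac{\delta}{\phi_d} \right)^{\delta} \rho^{-n-\delta}\, n^{\delta-1}, \qquad \rho := \int_0^{\infty} \frac{d\eta}{\phi(\eta)}.$$
The expected value of path length on a tree of $\mathcal{Y}_n$ is $(\delta + 1)\, n \log n + O(n)$.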
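The counting estimate of Proposition VII.10 can be tested on a case where everything is explicit. The sketch below takes $\phi(y) = (1+y)^3$ (increasing ternary trees, an example chosen here for illustration), for which (120) gives $Y(z) = (1-2z)^{-1/2} - 1$, hence $Y_n = (2n-1)!!$ and $\rho = 1/2$. The function names are hypothetical, the integral for $\rho$ is approximated by a plain midpoint rule after the substitution $\eta = t/(1-t)$, and logarithms are used to avoid floating-point overflow.

```python
from math import exp, lgamma, log


def rho_from_phi(phi, steps=100_000):
    """Midpoint-rule estimate of rho = int_0^oo d(eta)/phi(eta), via eta = t/(1-t)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        eta = t / (1.0 - t)
        total += h / (phi(eta) * (1.0 - t) ** 2)  # d(eta) = dt / (1 - t)^2
    return total


def log_asymptotic_count(n, d, phi_d, rho):
    """Logarithm of the right-hand side of the estimate in Proposition VII.10."""
    delta = 1.0 / (d - 1)
    return (lgamma(n + 1) - lgamma(delta) + delta * log(delta / phi_d)
            - (n + delta) * log(rho) + (delta - 1) * log(n))


def log_exact_ternary(n):
    """log of (2n-1)!! = (2n)! / (2^n n!), the exact count for phi = (1+y)^3."""
    return lgamma(2 * n + 1) - n * log(2) - lgamma(n + 1)


rho = rho_from_phi(lambda eta: (1.0 + eta) ** 3)
ratio = exp(log_exact_ternary(500) - log_asymptotic_count(500, 3, 1.0, rho))
print("rho   =", rho)    # expected to be close to 1/2
print("ratio =", ratio)  # exact / asymptotic at n = 500, expected to be close to 1
```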


For naturally occurring models like those of Figure 21 and more, many parameters of in- 
creasing tree varieties can be analysed in a synthetic way (e.g., the degree profile, the level 
profile [40]). What stands out is the type of conceptual reasoning afforded by singularity anal- 
ysis, which provides a direct path to the right order of magnitude of both combinatorial counts 
and basic parameters of structures. After this, it suffices to do the bookkeeping and get the 
constants right. ..... END OF EXAMPLE VII.22.


It is of interest to compare the properties of increasing trees and of simple vari-
eties of trees (examined in Subsection VII.3.2, p. 436). The conclusion is that simple
trees are of the “square-root” type, in the sense that the typical depth of a node and
the expected height are of order $\sqrt{n}$. By contrast, increasing trees, which are strongly
bound by an order constraint, have logarithmic depth and height [123, 124, 126]; they
belong to a “logarithmic” type. From a singular perspective, simple trees are associ-
ated to the universal $Z^{1/2}$ law, while increasing trees exhibit a divergence behaviour
($Z^{-1/(d-1)}$ in the polynomial case). Tolls then affect singularities of GFs in rather
different ways: through a factor $Z^{-1/2}$ for simple trees, through a factor $\log Z$ in the
case of increasing trees. Such abstract observations are typical of the spirit of analytic
combinatorics.


¹⁸Such parameters have been investigated in Subsection VI.10.3 (p. 409), and the binary search tree
recurrence there corresponds exactly to the case $\phi(w) = (1+w)^2$ here.


An interesting example of the joint use of nonlinear ODEs and singularity analy- 
sis is provided by urn processes of probability theory. There, an urn may contain balls 
of different colours. A fixed set of replacement rules is given (one for each colour). At 
any discrete instant, a ball is chosen uniformly at random, its colour is inspected, and 
the corresponding replacement rule is applied. The problem is to determine the evolu- 
tion of the urn at a large instant n. In the case of two colours and urns called balanced, 
it is shown in [99, 178] that the generating function of urn histories is determined by 
a nonlinear first-order autonomous equation, from which many characteristics of the 
urn can be effectively analysed. 

A spectacular result in the general area of random discrete structures and nonlin- 
ear differential equations is the discovery by Baik, Deift, and Johansson (Note VIII.20, 
p. 540) of the law governing the longest increasing subsequence in a random permu- 
tation. There, the solutions of the nonlinear Painlevé equation 


$$u''(x) = 2u(x)^3 + x\,u(x)$$


play a central rôle.


VII. 10. Perspective 


The theorems in this chapter demonstrate the central rôle of the singularity anal-
ysis theory developed in Chapter VI, this in a way that parallels what Chapter V did
for Chapter IV with meromorphic function analysis. Exploiting properties of complex
functions to develop coefficient asymptotics for abstract schemas helps us solve whole
classes of combinatorial constructions at once.

Within the context of analytic combinatorics, the results in this chapter have broad 
reach, and bring us closer to our ideal of a theory covering full analysis of combi- 
natorial objects of any “reasonable” description. Analytic side conditions defining 
schemas often play a significant rôle. Adding in this chapter the mathematical support
for handling set constructions (with the exp-log schema) and context-free construc- 
tions (with coefficient asymptotics of algebraic functions) to the support developed 
in Chapter V to handle the sequence construction (with the supercritical sequence 
schema) and regular constructions (with coefficient asymptotics of rational functions) 
gives us general methods encompassing a broad swath of combinatorial analysis, with 
a great many applications (Figure 22). 

Together, the methods covered in Chapter V, this chapter, and, next, Chapter VIII
(relative to the saddle point method) apply to virtually all of the generating functions
derived in Part A of this book by means of the symbolic techniques defined there.
The SEQ construction and regular specifications lead to poles; the SET construction
leads to algebraic singularities (in the case of logarithmic generators discussed here) or
to essential singularities (in most of the remaining cases discussed in Chapter VIII);
recursive (context-free) constructions lead to square-root singularities. The surpris-
ing end result is that the asymptotic counting sequences from all of these generating
functions have one of just a few functional forms. This universality means that com-
parisons of methods, finding optimal values of parameters, and many other outgrowths
of analysis can be very effective in practical situations. Indeed, because of the nature
of the asymptotic forms, the results are often exceedingly accurate, as we have seen
repeatedly.


   Combinatorial type                    Coeff. asymptotics (subexp. term)
   Rooted maps                           $n^{-5/2}$                 §VII.8.2
   Unrooted trees                        $n^{-5/2}$                 §VII.5
   Rooted trees                          $n^{-3/2}$                 §VII.3, §VII.4
   Excursions                            $n^{-3/2}$                 §VII.8.1
   Bridges                               $n^{-1/2}$                 §VII.8.1
   Mappings                              $n^{-1/2}$                 §VII.3.3
   Exp-log sets                          $n^{\kappa-1}$             §VII.2
   Increasing d-ary trees                $n^{-(d-2)/(d-1)}$         §VII.9.2

   Analytic form                         Singularity type           Coeff. asymptotics
   Positive irred. (polynomial syst.)    $Z^{1/2}$                  $\zeta^{-n}\, n^{-3/2}$                        §VII.6
   General algebraic                     $Z^{p/q}$                  $\zeta^{-n}\, n^{-p/q-1}$                      §VII.7
   Regular singularity (ODE)             $Z^{\theta} (\log Z)^{\ell}$   $\zeta^{-n}\, n^{-\theta-1} (\log n)^{\ell}$   §VII.9.1


FIGURE VII.22. A collection of universality laws summarized by the subexponential
factors involved in the asymptotics of counting sequences (top). A summary of the main
singularity types and corresponding asymptotic coefficient forms of this chapter (bottom).

The general theory of coefficient asymptotics based on singularities has many ap- 
plications outside of analytic combinatorics (see the notes below). The broad reach of 
the theory provides strong indications that universal laws hold for many combinatorial 
constructions and schemas yet to be discovered. 


The exp-log schema, like its companion, the supercritical-sequence schema, illustrates the 
level of generality that can be attained by singularity analysis techniques. Refinements of the 
results we have given can be found in the book by Arratia, Barbour, and Tavaré [16], which 
develops a stochastic process approach to these questions; see also [15] by the same authors for 
an accessible introduction. 

The rest of the chapter deals in an essential manner with recursively defined structures. As
noted repeatedly in the course of this chapter, this is often conducive to square-root singularity
and universal behaviours of the form $n^{-3/2}$. Simple varieties of trees have been introduced
in an important paper of Meir and Moon [356], that bases itself on methods developed earlier
by Pólya [395, 397] and Otter [382]. One of the merits of [356] is to demonstrate that a high
level of generality is attainable when discussing properties of trees. A similar treatment can be
applied more generally to recursively defined structures when their generating function satisfies
an implicit equation. In this way, nonplane unlabelled trees are shown to exhibit properties very
similar to their plane counterparts. It is of interest to note that some of the enumerative questions
in this area had been initially motivated by problems of theoretical chemistry: see the colourful
account of Cayley and Sylvester’s works in [54], the reference books by Harary–Palmer [259]
and Finch [165], as well as Pólya’s original studies [395, 397].


Algebraic functions are the modern counterpart of the study of curves by classical Greek 
mathematicians. They are either approached by algebraic methods (this is the core of algebraic 
geometry) or by transcendental methods. For our purposes, however, only rudiments of the 
theory of curves are needed. For this, there exist several excellent introductory books, of which 
we recommend the ones by Abhyankar [1], Fulton [223], and Kirwan [292]. On the algebraic 
side, we have striven to provide an introduction to algebraic functions that requires minimal 
apparatus. At the same time the emphasis has been put somewhat on algorithmic aspects, since 
most algebraic models are nowadays likely to be treated with the help of computer algebra. 
As regards symbolic computational aspects, we recommend the treatise by von zur Gathen and
Gerhard [487] for background, while polynomial systems are excellently reviewed in the book
by Cox, Little, and O’Shea [104].

In the combinatorial domain, algebraic functions have been used early: in Euler and Seg-
ner’s enumeration of triangulations (1753) as well as in Schröder’s famous “Vier combina-
torische Probleme” described by Stanley in [449, p. 177]. A major advance was the realization
by Chomsky and Schützenberger that algebraic functions are the “exact” counterpart of context-
free grammars and languages (see their historic paper [89]). A masterful summary of the early
theory appears in the proceedings edited by Berstel [45] while a modern and precise exposi-
tion forms the subject of Chapter 6 of Stanley’s book [449]. On the analytic-asymptotic side,
many researchers have long been aware of the power of Puiseux expansions in conjunction with
some version of singularity analysis (often in the form of the Darboux–Pólya method: see [397]
based on Pólya’s classic paper [395] of 1937). However, there appeared to be difficulties in cop-
ing with the fully general problem of algebraic coefficient asymptotics [77, 361]. We believe
that Section VII.7 sketches the first complete theory (though most ingredients are of folklore
knowledge). In the case of positive systems, the “Drmota-Lalley-Woods” theorem is the key to
most problems encountered in practice; its importance should be clear from the developments
of Section VII.6.


The applications of algebraic functions to context-free languages have been known for 
some time (e.g., [174]). Our presentation of 1-dimensional walks of a general type follows 
a recent article by Banderier and Flajolet [21], which can be regarded as the analytic pendant
of algebraic studies by Gessel [231, 232]. The kernel method has its origins in problems of
queueing theory and random walks [158, 159] and is further explored in an article by Bousquet-
Mélou and Petkovšek [70]. The algebraic treatment of random maps by the quadratic method
is due to brilliant studies of Tutte in the 1960’s: see for instance his census [471] and the 
account in the book by Jackson and Goulden [244]. A combinatorial-analytic treatment of 
multiconnectivity issues is given in [22], where the possibility of treating in a unified manner 
about a dozen families of maps appears clearly. 


Regarding differential equations, an early (and at the time surprising) occurrence of terms 
of the form $n^{\alpha}$, with $\alpha$ an algebraic number, in an asymptotic expansion is found in the
study [203], dedicated to multidimensional search trees. The asymptotic analysis of coeffi- 
cients of solutions to linear differential equations can also, in principle, be approached from the 
recurrences that these coefficients satisfy. Wimp and Zeilberger [498] propose an interesting 
approach based on results by George Birkhoff and his school (e.g., [57]), which are relative to 
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difference equations in the complex plane. There are however some doubts amongst special- 
ists regarding the completeness of Birkhoff’s programme. By contrast, the (easier) singularity 
theory of linear ODEs is well established, and, as we showed in this chapter, it is possible— 
in the regular singular case at least—to base on it a sound method for asymptotic coefficient 
extraction. 


VIII 


Saddle Point Asymptotics 


Like a lazy hiker, the path crosses the ridge at a low point; 
but unlike the hiker, the best path takes the steepest ascent to the ridge. 
[...] The integral will then be concentrated in a small interval.


— DANIEL GREENE AND DONALD KNUTH [259, sec. 4.3.3] 
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A saddle point of a surface is a point reminiscent of the inner part of a saddle or of a 
geographical pass between two mountains. If the surface represents the modulus of an 
analytic function, saddle points are simply determined as the zeros of the derivative of 
the function. 

In order to estimate complex integrals of an analytic function, it is often a good 
strategy to adopt as contour of integration a curve that “crosses” one or several of 
the saddle points of the integrand. When applied to integrals depending on a large 
parameter, this strategy provides in many cases accurate asymptotic information. In 
this book, we are primarily concerned with Cauchy integrals expressing coefficients of 
generating functions of large index. The implementation of the method is then fairly 
simple, since integration can be performed along a circle centred at the origin. 

The saddle point method can lead to accurate asymptotic estimates, including 
complete asymptotic expansions. Its principle is to use a saddle point crossing path, 
then estimate the integrand locally near this saddle point (where the modulus of the 
integrand achieves its maximum on the contour), and deduce, by local approximations 
and termwise integration, an asymptotic expansion of the integral itself. Some sort of 
“localization” or “concentration” property is required to ensure that the contribution 
near the saddle point captures the essential part of the integral. A simplified form of 
the method provides what are known as saddle point bounds—these are useful and 
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technically simple upper bounds obtained by applying trivial bounds to an integral 
relative to a saddle point crossing path. 

In the context of analytic combinatorics, the method applies well to Cauchy coefficient
integrals, in the case of rapidly varying functions: typical instances are entire
functions as well as functions with singularities at a finite distance that exhibit some 
form of exponential growth. Saddle point analysis then complements singularity anal- 
ysis whose scope is essentially the category of functions having only moderate (i.e., 
polynomial) growth at their singularities. The saddle point method is also a method of 
choice for the analysis of coefficients of “large powers” of some fixed function and, 
in this context, it paves the way to the study of multivariate asymptotics and limiting 
distributions developed in the next chapter. 


Applications are given here to Stirling’s formula, as well as the asymptotics of 
the central binomial coefficients, the involution numbers and the Bell numbers asso- 
ciated to set partitions. The asymptotic enumeration of integer partitions is one of the 
jewels of classical analysis and we provide an introduction to this rich topic where 
saddle points give access to effective estimates of an amazingly good quality. Other 
combinatorial applications include balls-in-bins models and capacity, the number of 
increasing subsequences in permutations, and blocks in set partitions. The counting 
of acyclic graphs (equivalently forests of unrooted trees), takes us beyond the basic 
paradigm of simple saddle points by making use of so-called “monkey saddles”. 


The plan of this chapter is as follows. First, we examine the surface determined by the 
modulus of an analytic function and give a classification of points into three kinds: ordinary 
points, zeros, and saddle points (Section VIII. 1). Next we develop general purpose saddle 
point bounds in Section VIII.2, which also serves to discuss the properties of saddle point 
crossing paths. The saddle point method per se is presented in Section VIII. 3, both in its most 
general form and in the way it specializes to Cauchy coefficient integrals. Section VIII.4 then
discusses three examples, involutions, set partitions, and fragmented permutations, which help 
us get further familiarized with the method. We next jump to a new level of generality and in- 
troduce in Section VIII.5 the abstract concept of admissibility—this approach has the merit of 
providing easily testable conditions, while opening the possibility of determining broad classes 
of functions to which the saddle point method is applicable. In particular, many combinatorial 
types whose leading construction is a SET operation are seen to be “automatically” amenable 
to saddle point analysis. The case of integer partitions, which is technically more advanced, 
is treated in a separate section, Section VIII.6. The framework of “large powers”, developed 
in Section VIII. 7 constitutes a combinatorial counterpart of the central limit theorem of proba- 
bility theory, and as such it provides a bridge to the study of limit distributions in Chapter IX. 
Other applications to discrete probability distributions are quickly examined in Section VIII. 8. 
Finally, Section VIII. 9 serves as a brief introduction to the rich subject of multiple saddle points 
and coalescence. 


VIII. 1. Landscapes of analytic functions and saddle points 


This section introduces a well-known classification of points on the surface rep-
resenting the modulus of an analytic function. In particular, as we are going to see,
saddle points, which are determined by roots of the function’s derivative, are associ-
ated with a simple geometric property that gives them their name.


        Ordinary point              Zero              Saddle point


FIGURE VIII.1. The different types of points on a surface $|f(z)|$: an ordinary point, a
zero, a simple saddle point. Top: a diagram showing the local structure of level curves (in
solid lines), steepest descent lines (dashed with arrows pointing towards the direction of
increase) and regions (hashed) where the surface lies below the reference value $|f(z_0)|$.
Bottom: the function $f(z) = \cosh z$ and the local shape of $|f(z)|$ near an ordinary point
($i\pi/4$), a zero ($i\pi/2$), and a saddle point (0), with level lines shown on the surfaces.

Consider any function $f(z)$ analytic for $z \in \Omega$, where $\Omega$ is some domain of $\mathbb{C}$. Its
modulus $|f(x+iy)|$ can be regarded as a function of the two real quantities, $x = \Re(z)$
and $y = \Im(z)$. As such, it can be represented as a surface in three dimensional space.
This surface is smooth (because analytic functions are infinitely differentiable), but far
from being arbitrary.

Let $z_0$ be an interior point of $\Omega$. The local shape of the surface $|f(z)|$ for $z$ near
$z_0$ depends on which of the initial elements in the sequence $f(z_0), f'(z_0), f''(z_0), \ldots$
vanish. As we are going to see, its points can be of only one of three types: ordinary
points (the generic case), zeros, and saddle points; see Figure 1. The classification of
points is conveniently obtained by considering polar coordinates, writing $z = z_0 +
re^{i\theta}$, with $r$ small.


An ordinary point is such that $f(z_0) \ne 0$, $f'(z_0) \ne 0$. This is clearly the generic
situation as analytic functions have only isolated zeros. In that case, one has for small
$r > 0$:


(1)    $$|f(z)| = \left| f(z_0) + re^{i\theta} f'(z_0) + O(r^2) \right| = |f(z_0)| \, \left| 1 + \lambda r e^{i(\theta+\phi)} + O(r^2) \right|,$$
where we have set $f'(z_0)/f(z_0) = \lambda e^{i\phi}$. The modulus then satisfies
$$|f(z)| = |f(z_0)| \left( 1 + \lambda r \cos(\theta + \phi) + O(r^2) \right).$$


Thus, for $r$ kept small enough and fixed, as $\theta$ varies, $|f(z)|$ is maximum when $\theta = -\phi$
(where it is $\sim |f(z_0)|(1 + \lambda r)$), and minimum when $\theta = -\phi + \pi$ (where it is
$\sim |f(z_0)|(1 - \lambda r)$). When $\theta = -\phi \pm \frac{\pi}{2}$, one has $|f(z)| = |f(z_0)| + o(r)$, which means
that $|f(z)|$ is essentially constant. This is easily interpreted: the line $\theta \equiv -\phi \pmod{\pi}$ is
(locally) a steepest descent line, the perpendicular line $\theta \equiv -\phi + \frac{\pi}{2} \pmod{\pi}$ is locally a
level line. In particular, near an ordinary point, the surface $|f(z)|$ has neither a minimum nor a
maximum. In figurative terms, this is like standing on the flank of a mountain.


A zero is by definition a point such that $f(z_0) = 0$. In this case, the function $|f(z)|$
attains its minimum value 0 at $z_0$. Locally, to first order, one has $|f(z)| \sim |f'(z_0)|\, r$.
A zero is thus like a sink or the bottom of a lake, save that, in the landscape of an
analytic function, all lakes are at sea level.


A saddle point is a point such that $f(z_0) \ne 0$, $f'(z_0) = 0$. It is said to be a simple
saddle point if furthermore $f''(z_0) \ne 0$. In that case, a calculation similar to (1),


(2)    $$|f(z)| = \left| f(z_0) + \tfrac12 r^2 e^{2i\theta} f''(z_0) + O(r^3) \right| = |f(z_0)| \, \left| 1 + \lambda r^2 e^{i(2\theta+\phi)} + O(r^3) \right|,$$


where we have set $\tfrac12 f''(z_0)/f(z_0) = \lambda e^{i\phi}$, shows that the modulus satisfies
$$|f(z)| = |f(z_0)| \left( 1 + \lambda r^2 \cos(2\theta + \phi) + O(r^3) \right).$$


Thus, starting at the direction $\theta = -\phi/2$ and turning around $z_0$, the following se-
quence of events regarding the modulus $|f(z)| = |f(z_0 + re^{i\theta})|$ is observed: it is maximal
($\theta = -\phi/2$), stationary ($\theta = -\phi/2 + \frac{\pi}{4}$), minimal ($\theta = -\phi/2 + \frac{\pi}{2}$), stationary
($\theta = -\phi/2 + \frac{3\pi}{4}$), maximal again ($\theta = -\phi/2 + \pi$), and so on. The pattern, symbol-
ically ‘+ = − =’, repeats itself twice. This is superficially similar to an ordinary point,
save for the important fact that changes are observed at twice the angular speed. Ac-
cordingly, the shape of the surface looks quite different; it is like the central part of a
saddle. Two level curves cross at a right angle: one steepest descent line (away from
the saddle point) is perpendicular to another steepest descent line (towards the saddle
point). In a mountain landscape, this is thus much like a pass between two mountains.
The two regions on each side corresponding to points with an altitude below a simple
saddle point are often referred to as “valleys”.

Generally, a multiple saddle point has multiplicity $p$ if $f(z_0) \ne 0$ and all deriva-
tives $f'(z_0), \ldots, f^{(p)}(z_0)$ are equal to zero while $f^{(p+1)}(z_0) \ne 0$. In that case, the
basic pattern ‘+ = − =’ repeats itself $p + 1$ times. For instance, from a double saddle
point ($p = 2$), three roads go down to three different valleys separated by the flanks of
three mountains. A double saddle point is also called a “monkey saddle” since it can
be visualized as a saddle having places for the legs and the tail: see Figure 11 (p. 559)
and Figure 13 (p. 562).
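The ‘+ = − =’ pattern lends itself to a quick numerical experiment: walk once around a small circle centred at $z_0$ and count how many times $|f(z)|$ crosses the reference level $|f(z_0)|$. By the expansions above one expects 2 crossings at an ordinary point, 4 at a simple saddle point, $2(p+1)$ at a saddle point of multiplicity $p$, and none at a zero. The sketch below is an illustration under stated assumptions: the function name `modulus_sign_changes`, the radius, the sample count and the angular offset (used to avoid sampling exactly on a level line) are all arbitrary choices.

```python
import cmath
from math import pi


def modulus_sign_changes(f, z0, r=1e-2, samples=720):
    """Count crossings of the level |f(z0)| by |f| along the circle |z - z0| = r."""
    base = abs(f(z0))
    above = []
    for k in range(samples):
        theta = 2 * pi * k / samples + 0.1  # small offset: avoid exact level directions
        above.append(abs(f(z0 + r * cmath.exp(1j * theta))) > base)
    # above[k - 1] wraps around for k = 0, so the count is cyclic.
    return sum(above[k] != above[k - 1] for k in range(samples))


ordinary = modulus_sign_changes(cmath.cosh, 1j * pi / 4)   # ordinary point of cosh
zero = modulus_sign_changes(cmath.cosh, 1j * pi / 2)       # zero of cosh
saddle = modulus_sign_changes(cmath.cosh, 0)               # simple saddle point of cosh
monkey = modulus_sign_changes(lambda z: 1 + z ** 3, 0)     # double ("monkey") saddle
print(ordinary, zero, saddle, monkey)
```

The three points used for cosh are those of Figure VIII.1; the function $1 + z^3$ is an extra example, chosen because its first two derivatives vanish at the origin while the third does not.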


Theorem VIII.1 (Classification of points on modulus surfaces). A surface $|f(z)|$ at-
tached to the modulus of a function analytic over an open set $\Omega$ has points of only
three possible types: (i) ordinary points, (ii) zeros, (iii) saddle points. Under projec-
tion on the complex plane, a simple saddle point is locally the common apex of two
curvilinear sectors with angle $\frac{\pi}{2}$, referred to as “valleys”, where the modulus of the
function is smaller than at the saddle point.


FIGURE VIII.2. The “tripod”: two views of $|1 + z + z^2 + z^3|$ as function of $x = \Re(z)$,
$y = \Im(z)$: (left) the modulus as a surface in $\mathbb{R}^3$; (right) the projection of level lines on
the z-plane.

As a consequence, the surface defined by the modulus of an analytic function has 
no maximum: this property is known as the Maximum Modulus Principle. It has no 
minimum either, apart from zeros. It is therefore a peakless landscape in de Bruijn’s 
words [111]. Accordingly, for a meromorphic function, peaks are at $\infty$ and minima
are at 0, the other points being either ordinary or saddle points. 

EXAMPLE VIII.1. The tripod: a cubic polynomial. An idea of the typical shape of the surface
representing the modulus of an analytic function can be obtained by examining Figure 2 relative
to the third degree polynomial $f(z) = 1 + z + z^2 + z^3$. Since $f(z) = (1 - z^4)/(1 - z)$, the
zeros are at
$$-1, \qquad i, \qquad -i.$$
The saddle points are at the zeros of the derivative $f'(z) = 1 + 2z + 3z^2$, that is, at the points
$$\zeta := -\frac13 + \frac{i}{3}\sqrt{2}, \qquad \zeta' := -\frac13 - \frac{i}{3}\sqrt{2}.$$


The diagram below summarizes the position of these “interesting” points:


(3)
                                  $i$  (zero)
              $-\frac13 + \frac{i}{3}\sqrt{2}$  (saddle point)
     $-1$  (zero)                 $0$
              $-\frac13 - \frac{i}{3}\sqrt{2}$  (saddle point)
                                  $-i$  (zero)


The three zeros are especially noticeable on Figure 2 (left), where they appear at the end of the
three “legs”. The two saddle points are visible on Figure 2 (right) as intersection points of level
curves. ..... END OF EXAMPLE VIII.1.


> VIIL1. The Fundamental Theorem of Algebra. This theorem asserts that a polynomial has at 
least one root (hence n roots if its degree is n). Let P(z) = 1+a1z+---anz” bea polynomial 
of degree n. Consider f(z) = 1/P(z). By basic analysis, one can take R sufficiently large, so 
that on |z| = R, one has |f(z)| < 4. Assume a contrario that P(z) has no zero. Then, f(z) 
which is analytic in |z| < R should attain its maximum at an interior point (since f(0) = 1), 
so that a contradiction has been reached. 


> VII.2. Saddle points of polynomials and the convex hull of zeros. Let P be a polynomial 
and H the convex hull of its zeros. Then any root of P’(z) lies in H. (Proof: assume distinct 


zeros and consider 
P'(z) 1 
(Zz) = — ; 
P(z) a: a Ae 


If z lies outside 7, then z sees all zeros a in a half-plane, this by elementary geometry. By 
projection on the normal to the half plane boundary, there results that, for some 6, one has 


Re"? b(z)) < 0, so that P’(z) 4 0.) <J 
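The statements of Example VIII.1 and Note VIII.2 can be checked numerically on the tripod polynomial. The sketch below is only an illustration: the helper names (`f`, `f_prime`, `convex_weights`) are hypothetical, and the convex-hull check uses the classical weighted-average form of the argument (taking conjugates in $\sum_\alpha 1/(z-\alpha) = 0$ gives $z = \sum_\alpha w_\alpha \alpha$ with positive weights $w_\alpha \propto |z-\alpha|^{-2}$), which is a variant of the half-plane proof sketched in the note, not the note's own wording.

```python
import cmath

def f(z):
    """The tripod polynomial 1 + z + z^2 + z^3."""
    return 1 + z + z**2 + z**3

def f_prime(z):
    """Its derivative 1 + 2z + 3z^2."""
    return 1 + 2 * z + 3 * z**2

# Zeros: the fourth roots of unity other than 1, since f(z) = (1 - z^4)/(1 - z).
zeros = [cmath.exp(2j * cmath.pi * k / 4) for k in (1, 2, 3)]

# Saddle points: roots of 3z^2 + 2z + 1 by the quadratic formula.
disc = cmath.sqrt(2**2 - 4 * 3 * 1)
saddles = [(-2 + disc) / 6, (-2 - disc) / 6]

def convex_weights(point, roots):
    """Weights proportional to 1/|point - a|^2, normalized to sum to 1."""
    raw = [1.0 / abs(point - a) ** 2 for a in roots]
    total = sum(raw)
    return [w / total for w in raw]

for s in saddles:
    weights = convex_weights(s, zeros)
    # Weighted average of the zeros; it should coincide with the saddle point.
    combo = sum(w * a for w, a in zip(weights, zeros))
    print(s, abs(f_prime(s)), abs(combo - s))
```

Each saddle point is recovered as a convex combination of the three zeros, which places it inside their convex hull.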


VIII. 2. Saddle point bounds 


Saddle point analysis is a general method suited to the estimation of integrals of
analytic functions $F(z)$,

$$I = \int_A^B F(z)\,dz, \tag{4}$$

where $F(z) \equiv F_n(z)$ involves some large parameter $n$. The method is instrumental
when the integrand $F$ is subject to rather violent variations, typically when there
occurs in it some exponential or some fixed function raised to a large power $n \to +\infty$.
In this section, we discuss some of the global properties of saddle point contours, then
specialize the discussion to Cauchy coefficient integrals. General bounds, known as
saddle point bounds, which are easy to derive, result from simple geometric
considerations.


Starting from the general form (4), we let $C$ be a contour joining $A$ and $B$ and
taken in a domain of the complex plane where $F(z)$ is analytic. By standard
inequalities, we have

$$|I| \le \|C\| \cdot \sup_{z \in C} |F(z)|, \tag{5}$$

with $\|C\|$ representing the length of $C$. This is the common trivial bound from
integration theory applied to a fixed contour $C$.

For an analytic integrand $F$ with $A$ and $B$ inside the domain of analyticity, there
is an infinite class $\mathcal{P}$ of acceptable paths to choose from, all in the analyticity domain
of $F$. Thus, by optimizing the bound (5), we may write

$$|I| \le \inf_{C \in \mathcal{P}} \Big[ \|C\| \cdot \sup_{z \in C} |F(z)| \Big], \tag{6}$$

where the infimum is taken over all paths $C \in \mathcal{P}$. Broadly speaking, a bound of this
type is called a saddle point bound¹.

The length factor $\|C\|$ usually turns out to be unimportant for asymptotic bounding
purposes; this is for instance the case when paths remain in finite regions of the
complex plane. If there happens to be a path $C$ from $A$ to $B$ such that no point is
at an altitude higher than $\sup(|F(A)|, |F(B)|)$, then a simple bound results, namely,
$|I| \le \|C\| \cdot \sup(|F(A)|, |F(B)|)$: this is in a sense the uninteresting case. The common
situation, typical of Cauchy coefficient integrals of combinatorics, is that paths have
to go at some higher altitude than the end points. A path $C$ that traverses a saddle point
by connecting two points at a lower altitude on the surface $|F(z)|$ and by following
two steepest descent lines across the saddle point is clearly a local minimum for the
path functional

$$\Phi(C) = \sup_{z \in C} |F(z)|,$$

as neighbouring paths must possess a higher maximum. Such a path is called a saddle
point path or steepest descent path. Then, the search for a path minimizing

$$\inf_{C} \Big[ \sup_{z \in C} |F(z)| \Big]$$

(a simplification of (6) to its essential feature) naturally leads to considering saddle
points and saddle point paths. This leads to the variant of (6),

$$|I| \le \|C_0\| \cdot \sup_{z \in C_0} |F(z)|, \qquad C_0 \text{ minimizes } \sup_{z \in C} |F(z)|, \tag{7}$$

also referred to as a saddle point bound.

We can summarize this stage of the discussion by a simple generic statement.

Theorem VIII.2 (General saddle point bounds). Let $F(z)$ be a function analytic in
a domain $\Omega$. Consider the class of integrals $\int_\gamma F(z)\,dz$ where the contour $\gamma$ connects
two points $A$, $B$ and is constrained to a class $\mathcal{P}$ of allowable paths in $\Omega$ (e.g., the ones
that encircle 0). Then one has the saddle point bounds²:

$$\left| \int_\gamma F(z)\,dz \right| \le \|C_0\| \cdot \sup_{z \in C_0} |F(z)|, \quad \text{where $C_0$ is any path that minimizes } \sup_{z \in C} |F(z)|. \tag{8}$$

If $A$ and $B$ lie in opposite valleys of a saddle point $z_0$, then the minimization problem
is solved by saddle point paths $C_0$ made of arcs connecting $A$ to $B$ through $z_0$.


Borrowing a metaphor of de Bruijn [111], the situation may be described as fol- 
lows. Estimating a path integral is like estimating the difference of altitude between 
two villages in a mountain range. If the two villages are in different valleys, the best 
strategy (this is what road networks often do) consists in following paths that cross 
boundaries between valleys at passes, i.e., through saddle points. 


¹Notice additionally that the optimization problem need not be solved exactly, as any approximate
solution to (6) still furnishes a valid upper bound because of the universal character of the trivial bound (5).

²The form given by (8) is in principle weaker than the form (6), since it does not take into account the
length of the contour itself, but the difference is immaterial in practically all our asymptotic problems.

The statement of Theorem VIII.2 does not fix all details of the contour when
there are several saddle points “separating” $A$ and $B$; the problem is like finding the
most economical route across a whole mountain range. But at least it suggests the
construction of a composite contour made of connected arcs crossing saddle points
from valley to valley. Furthermore, in cases of combinatorial interest some strong
positivity is present and the selection of the suitable saddle point contour is normally
greatly simplified, as we explain next.

▷ VIII.3. An integral of powers. Consider the polynomial $P(z) = 1 + z + z^2 + z^3$ of Example 1.
Define the line integral

$$I_n = \int_{-1}^{+i} P(z)^n\,dz.$$

On the segment connecting the end points, the maximum of $|P(z)|$ is 0.63831, giving the weak
trivial bound $I_n = O(0.63831^n)$. In contrast, there is a saddle point at $\zeta = -\frac{1}{3} + \frac{i}{3}\sqrt{2}$,
resulting in the bound

$$|I_n| \le \lambda\,|P(\zeta)|^n, \qquad \lambda := |\zeta + 1| + |i - \zeta| \doteq 1.44141,$$

as follows from adopting a contour made of two segments connecting $-1$ to $i$ through $\zeta$. Discuss
further the bounds on $\int_{\alpha}^{\alpha'}$ when $(\alpha, \alpha')$ ranges over all pairs of roots of $P$. ◁


Saddle point bounds for Cauchy coefficient integrals. Saddle point bounds can
be applied to Cauchy coefficient integrals,

$$g_n \equiv [z^n]G(z) = \frac{1}{2i\pi} \oint G(z)\,\frac{dz}{z^{n+1}}, \tag{9}$$

for which we can avail ourselves of the previous discussion, with $F_n(z) = G(z)z^{-n-1}$.
In (9) the symbol $\oint$ indicates that the allowable paths are constrained to encircle the
origin (the domain of definition of the integrand is a subset of $\mathbb{C} \setminus \{0\}$; the points $A$, $B$
can then be seen as coinciding and taken somewhere along the negative real line).

In the particular case where $G(z)$ is a function with nonnegative coefficients, there
is usually a saddle point on the positive real axis. Indeed, assume that $G(z)$, which has
radius of convergence $R$ with $0 < R \le +\infty$, satisfies $G(x) \to +\infty$ as $x \to R^-$. Then
the integrand $F(z) := G(z)z^{-n-1}$ satisfies $F(0^+) = F(R^-) = +\infty$. This means
that there exists at least one local minimum, hence, at least one value $\zeta \in (0, R)$ at which
the derivative $F'(\zeta)$ of the real function $F(x)$ vanishes. (Actually, there
can be only one such point; see Note 4, p. 516.) But this point $\zeta$ is also a saddle point
of the complex function $F(z)$, since $F'(\zeta) = 0$. Since $\zeta$ is a local minimum along the
real axis, we have additionally $F''(\zeta) > 0$, and the saddle point is crossed transversally
by a circle of radius $\zeta$. Thus, the saddle point bound, specialized to circles centred at
the origin, becomes:


Corollary VIII.1 (Saddle point bounds for generating functions). Let $G(z)$ be analytic
at 0 with nonnegative coefficients and radius of convergence $R \le +\infty$. Assume
that $G(R^-) = +\infty$. Then one has

$$[z^n]G(z) \le \frac{G(\zeta)}{\zeta^n} \quad \text{with } \zeta \in (0, R) \text{ the unique root of } \zeta\,\frac{G'(\zeta)}{G(\zeta)} = n + 1. \tag{10}$$
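As a sketch of how (10) follows from the general bound, one can specialize Theorem VIII.2 to the circle $|z| = \zeta$. The only facts used are the form $F(z) = G(z)z^{-n-1}$ of the integrand and the nonnegativity of the coefficients of $G$, which gives $|G(z)| \le G(|z|)$. The intermediate steps below are a reconstruction under these assumptions rather than a quotation of a proof.

```latex
% Saddle point equation: F'(\zeta) = 0 with F(z) = G(z) z^{-n-1}
\[
  \frac{F'(z)}{F(z)} = \frac{G'(z)}{G(z)} - \frac{n+1}{z} = 0
  \quad\Longleftrightarrow\quad
  \zeta\,\frac{G'(\zeta)}{G(\zeta)} = n+1 .
\]
% Trivial bound on the circle |z| = \zeta, of length 2\pi\zeta,
% where |G(z)| \le G(\zeta) and |z^{-n-1}| = \zeta^{-n-1}
\[
  [z^n]G(z)
  = \frac{1}{2i\pi}\oint_{|z|=\zeta} G(z)\,\frac{dz}{z^{n+1}}
  \;\le\; \frac{1}{2\pi}\cdot 2\pi\zeta\cdot\frac{G(\zeta)}{\zeta^{n+1}}
  = \frac{G(\zeta)}{\zeta^{n}} .
\]
```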




FIGURE VIII.3. The modulus of the integrands of $J_n$ (central binomials) and $K_n$
(inverse factorials) for $n = 5$ and the corresponding saddle point contours.


This corollary is very similar to Proposition IV.1, p. 233, on which it sheds a new 
light, while paving the way to the full saddle-point method to be developed in the next 
section. 


We examine below two particular cases related to the central binomial and the
inverse factorial. The corresponding landscapes in Figure 3, which bear a surprising
resemblance to one another, are, by the previous discussion, instances of a general
pattern for functions with nonnegative coefficients. It is seen on these two examples
that the saddle point bounds already catch the proper exponential growths, being off
only by a factor of $O(n^{-1/2})$.


EXAMPLE VIII.2. Saddle point bounds for central binomials and inverse factorials. Consider
the two contour integrals around the origin,

$$J_n = \frac{1}{2i\pi} \oint (1+z)^{2n}\,\frac{dz}{z^{n+1}}, \qquad K_n = \frac{1}{2i\pi} \oint e^{z}\,\frac{dz}{z^{n+1}}, \tag{11}$$

whose values are otherwise known, by virtue of Cauchy’s coefficient formula, to be $J_n = \binom{2n}{n}$
and $K_n = 1/n!$. In that case, one can think of the end points $A$ and $B$ as coinciding and taken
somewhat arbitrarily on the negative real axis, while the contour has to encircle the origin once
and counter-clockwise.

The landscapes of the two integrands are represented on Figure 3. The saddle point equations
are respectively

$$\frac{2n}{1+\zeta} - \frac{n+1}{\zeta} = 0, \qquad 1 - \frac{n+1}{\zeta'} = 0,$$

the corresponding saddle points being $\zeta = \frac{n+1}{n-1}$ and $\zeta' = n + 1$. This provides the upper
bounds

$$J_n = \binom{2n}{n} \le \left( \frac{4n^2}{n^2-1} \right)^{n} \le \frac{16}{9}\,4^n, \qquad K_n = \frac{1}{n!} \le \frac{e^{n+1}}{(n+1)^n}, \tag{12}$$

which are valid for all values $n \ge 2$.
END OF EXAMPLE VIII.2.


▷ VIII.4. Upward convexity of $G(x)x^{-n}$. For $G(z)$ having nonnegative coefficients at the
origin, the quantity $G(x)x^{-n}$ is upward convex for $x > 0$, so that the saddle point equation for
$\zeta$ can have at most one root. Indeed, the second derivative

$$\frac{d^2}{dx^2}\,\frac{G(x)}{x^n} = \frac{x^2 G''(x) - 2nxG'(x) + n(n+1)G(x)}{x^{n+2}} \tag{13}$$

is positive for $x > 0$ since its numerator,

$$\sum_{k \ge 0} (n+1-k)(n-k)\,g_k x^k, \qquad g_k := [z^k]G(z),$$

has only nonnegative coefficients. (See Note IV.44, p. 266, for an alternative derivation.) ◁


▷ VIII.5. A minor optimization. The bounds of Eq. (6), p. 512, which take the length of the
contour into account, lead to estimates that closely resemble Eq. (10). Indeed, we have

$$[z^n]G(z) \le \frac{G(\widehat{\zeta})}{\widehat{\zeta}^{\,n}}, \qquad \widehat{\zeta} \text{ root of } \zeta\,\frac{G'(\zeta)}{G(\zeta)} = n,$$

when optimization is carried out over circles centred at the origin. ◁
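The bounds (12) of Example VIII.2 and the variant of Note VIII.5 are easy to test against exact values. The following sketch is illustrative only; the function names are hypothetical. Exact rational arithmetic is used for the binomial chain because equality holds in its last step at $n = 2$, and floating point is used for the inverse factorial.

```python
import math
from fractions import Fraction

def central_binomial_bounds(n):
    """Return (exact J_n, saddle point bound, cruder bound (16/9) 4^n) as exact rationals."""
    exact = Fraction(math.comb(2 * n, n))
    saddle = Fraction(4 * n * n, n * n - 1) ** n
    crude = Fraction(16, 9) * 4 ** n
    return exact, saddle, crude

def inverse_factorial_bounds(n):
    """Return (exact 1/n!, bound of Note VIII.5 e^n/n^n, bound of (12) e^(n+1)/(n+1)^n)."""
    exact = 1.0 / math.factorial(n)
    optimized = math.exp(n) / n ** n
    saddle = math.exp(n + 1) / (n + 1) ** n
    return exact, optimized, saddle

for n in (2, 5, 10, 20):
    j_exact, j_saddle, _ = central_binomial_bounds(n)
    k_exact, _, k_saddle = inverse_factorial_bounds(n)
    # Ratios bound/exact, rescaled by sqrt(n): they stay bounded as n grows.
    print(n, float(j_saddle / j_exact) / math.sqrt(n), k_saddle / k_exact / math.sqrt(n))
```

The printed ratios show the bounds overshooting the exact values by a factor of order $\sqrt{n}$, in line with the remark preceding the example.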


VIII.3. Overview of the saddle point method 


Given a complex integral with a contour traversing a simple saddle point, the 
saddle point corresponds locally to a maximum of the integrand along the path. It is 
then natural to expect that a small neighbourhood of the saddle point might provide the 
dominant contribution to the integral. The saddle point method is applicable precisely 
when this is the case and when this dominant contribution can be estimated by means 
of local expansions. The method then constitutes the complex-analytic counterpart of 
the method of Laplace (APPENDIX B: Laplace’s method, p. 700) for the evaluation of 
real integrals depending on a large parameter, and we can regard it as being 


Saddle Point Method = Choice of Contour + Laplace’s Method. 


Like its real-variable counterpart, the saddle point method is a general strategy rather 
than a completely deterministic algorithm, since many choices are left open in the 
implementation of the method concerning details of the contour and choices of its 
splitting into pieces. 


To proceed, it is convenient to set $F(z) = e^{f(z)}$ and consider

$$\int_A^B e^{f(z)}\,dz, \tag{14}$$

where $f(z) \equiv f_n(z)$, like $F(z) \equiv F_n(z)$, involves some large parameter $n$. Following
possibly some preparation based on Cauchy’s theorem, we may assume that the
contour $C$ connects two end points $A$ and $B$ lying in opposite valleys of the saddle
point $\zeta$. The saddle point equation is $F'(\zeta) = 0$, or equivalently, since $F = e^f$:

$$f'(\zeta) = 0.$$


Goal: Estimate $\int_A^B F(z)\,dz$, setting $F = e^f$.

— A contour $C$ through a saddle point $\zeta$ such that $f'(\zeta) = 0$ has been chosen.
— The contour is split as $C = C^{(0)} \cup C^{(1)}$.
The following conditions are to be verified.

SP1: Tails pruning. On the contour $C^{(1)}$, the tails integral $\int_{C^{(1)}}$ is negligible:

$$\int_{C^{(1)}} F(z)\,dz = o\left( \int_C F(z)\,dz \right).$$

SP2: Central approximation. Along $C^{(0)}$, a quadratic expansion,

$$f(z) = f(\zeta) + \frac{1}{2} f''(\zeta)(z - \zeta)^2 + O(\eta_n),$$

is valid, with $\eta_n \to 0$ as $n \to \infty$, uniformly with respect to $z \in C^{(0)}$.

SP3: Tails completion. The incomplete Gaussian integral taken over the central range is
asymptotically equivalent to a complete Gaussian integral (with $\varepsilon = \pm 1$):

$$\int_{C^{(0)}} e^{\frac{1}{2} f''(\zeta)(z-\zeta)^2}\,dz \sim \varepsilon i \int_{-\infty}^{\infty} e^{-|f''(\zeta)|x^2/2}\,dx \equiv \varepsilon i \sqrt{\frac{2\pi}{|f''(\zeta)|}}.$$

Result: Assuming SP1, SP2, and SP3, one has, with $\varepsilon = \pm 1$:

$$\frac{1}{2i\pi} \int_A^B e^{f(z)}\,dz \sim \varepsilon\,\frac{e^{f(\zeta)}}{\sqrt{2\pi |f''(\zeta)|}}.$$

FIGURE VIII.4. The saddle point strategy.


The saddle point method, of which a summary is given in Figure 4, is based on
a fundamental splitting of the integration contour. We decompose $C = C^{(0)} \cup C^{(1)}$,
where $C^{(0)}$, called the “central part”, contains $\zeta$ (or passes very near to it) and $C^{(1)}$
is formed of the two remaining “tails”. This splitting has to be determined in each
case in accordance with the growth of the integrand. The basic principle rests on two
major conditions: the contributions of the two tails should be asymptotically negligible
(condition SP1); in the central region, the quantity $f(z)$ in the integrand should be
asymptotically well approximated by a quadratic function (condition SP2). Under
these conditions, the integral is asymptotically equivalent to an incomplete Gaussian
integral. It then suffices to verify (this is condition SP3, usually a minor a posteriori
technical verification) that tails can be completed back, introducing only negligible
error terms. By this sequence of steps, the original integral is asymptotically reduced
to a complete Gaussian integral, which evaluates in closed form.


Specifically, the three steps of the saddle point method involve checking conditions
expressed by Equations (15), (16), and (18) below.

SP1: Tails pruning. On the contour $C^{(1)}$, the tail integral $\int_{C^{(1)}}$ is negligible:

$$\int_{C^{(1)}} F(z)\,dz = o\left( \int_C F(z)\,dz \right). \tag{15}$$

This condition is usually established by proving that $F(z)$ remains small enough (e.g.,
exponentially small in the scale of the problem) away from $\zeta$, for $z \in C^{(1)}$.
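SP2: Central approximation. Along $C^{(0)}$, a quadratic expansion,

$$f(z) = f(\zeta) + \frac{1}{2} f''(\zeta)(z - \zeta)^2 + O(\eta_n), \tag{16}$$

is valid, with $\eta_n \to 0$ as $n \to \infty$, uniformly for $z \in C^{(0)}$. This guarantees that $\int e^{f}$ is
well-approximated by an incomplete Gaussian integral:

$$\int_{C^{(0)}} e^{f(z)}\,dz \sim e^{f(\zeta)} \int_{C^{(0)}} e^{\frac{1}{2} f''(\zeta)(z-\zeta)^2}\,dz. \tag{17}$$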
SP3: Tails completion. The tails can be completed back, at the expense of
asymptotically negligible terms, meaning that the incomplete Gaussian integral is
asymptotically equivalent to a complete one,

$$\int_{C^{(0)}} e^{\frac{1}{2} f''(\zeta)(z-\zeta)^2}\,dz \sim \varepsilon i \int_{-\infty}^{\infty} e^{-|f''(\zeta)|x^2/2}\,dx \equiv \varepsilon i \sqrt{\frac{2\pi}{|f''(\zeta)|}}, \tag{18}$$

where $\varepsilon = \pm 1$ is determined by the orientation of the original contour $C$. This last step
deserves a word of explanation. Along a steepest descent curve across $\zeta$, the quantity
$f''(\zeta)(z-\zeta)^2$ is real and negative, as we saw when discussing saddle point landscapes.
Indeed, if $f''(\zeta) = e^{i\phi}|f''(\zeta)|$, one has $\arg(z - \zeta) \equiv -\phi/2 + \pi/2 \pmod{\pi}$. Thus, the change
of variables $x = \pm i(z-\zeta)e^{i\phi/2}$ reduces the left side of (18) to an integral taken along
(or close to) the real line. The condition (18) then demands that this integral can be
completed to a complete Gaussian integral, which itself evaluates in closed form.
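If these conditions are granted, one has the chain

$$\int_C e^{f}\,dz \sim \int_{C^{(0)}} e^{f}\,dz \sim e^{f(\zeta)} \int_{C^{(0)}} e^{\frac{1}{2} f''(\zeta)(z-\zeta)^2}\,dz \sim \pm i\,e^{f(\zeta)} \sqrt{\frac{2\pi}{|f''(\zeta)|}},$$

by virtue of Equations (15), (17), (18). In summary: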


Theorem VIII.3 (Saddle Point Algorithm). Consider an integral $\int_A^B F(z)\,dz$, where
the integrand $F = e^f$ is an analytic function depending on a large parameter and
$A$, $B$ lie in opposite valleys across a saddle point $\zeta$, which is a root of the saddle point
equation

$$f'(\zeta) = 0,$$

or, equivalently, $F'(\zeta) = 0$. Assume that the contour $C$ connecting $A$ to $B$ can be split
into $C = C^{(0)} \cup C^{(1)}$ in such a way that the following conditions are satisfied:
(i) tails are negligible, in the sense of Equation (15) of SP1,
(ii) a central approximation holds, in the sense of Equation (16) of SP2,
(iii) tails can be completed back, in the sense of Equation (18) of SP3.
Then one has, with $\varepsilon \in \{-1, +1\}$ reflecting orientation:

$$\frac{1}{2i\pi} \int_A^B e^{f(z)}\,dz \sim \varepsilon\,\frac{e^{f(\zeta)}}{\sqrt{2\pi |f''(\zeta)|}}. \tag{19}$$
It can be verified at once that a blind application of the formula to the two integrals
of Example 2 produces the correct asymptotic estimates

$$J_n \equiv \binom{2n}{n} \sim \frac{4^n}{\sqrt{\pi n}} \qquad \text{and} \qquad K_n \equiv \frac{1}{n!} \sim \frac{1}{n^n e^{-n} \sqrt{2\pi n}}. \tag{20}$$

The complete justification in the case of $K_n$ is given in Example 3 below. The case
of $J_n$ is treated by the general theory of “large powers” of Section VIII. 7, p. 547.
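The “blind application” of (19) can be reproduced numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, $\varepsilon$ is taken equal to $+1$, and the saddle points $\zeta = (n+1)/(n-1)$ and $\zeta' = n+1$ are those of Example VIII.2, with $f = \log F$ written out by hand for each integrand.

```python
import math

def saddle_point_estimate(f_at_zeta, f2_at_zeta):
    """Right side of (19) with epsilon = +1: exp(f(zeta)) / sqrt(2*pi*|f''(zeta)|)."""
    return math.exp(f_at_zeta) / math.sqrt(2.0 * math.pi * abs(f2_at_zeta))

def estimate_central_binomial(n):
    """Estimate of J_n from f(z) = 2n log(1+z) - (n+1) log z at zeta = (n+1)/(n-1)."""
    zeta = (n + 1) / (n - 1)
    f = 2 * n * math.log(1 + zeta) - (n + 1) * math.log(zeta)
    f2 = -2 * n / (1 + zeta) ** 2 + (n + 1) / zeta ** 2
    return saddle_point_estimate(f, f2)

def estimate_inverse_factorial(n):
    """Estimate of K_n from f(z) = z - (n+1) log z at zeta = n+1."""
    zeta = n + 1.0
    f = zeta - (n + 1) * math.log(zeta)
    f2 = (n + 1) / zeta ** 2
    return saddle_point_estimate(f, f2)

for n in (10, 50, 100):
    ratio_j = estimate_central_binomial(n) / math.comb(2 * n, n)
    ratio_k = estimate_inverse_factorial(n) * math.factorial(n)
    # Both ratios approach 1 as n grows.
    print(n, ratio_j, ratio_k)
```

This only checks the formula numerically; it does not verify the conditions SP1, SP2, SP3, which is the task of the analysis that follows in the text.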


In order for the saddle point method to work, conflicting requirements regarding
the dimensioning of $C^{(0)}$ and $C^{(1)}$ must be satisfied. The tails pruning and tails
completion conditions, SP1 and SP3, force $C^{(0)}$ to be chosen large enough, so as
to capture the main contribution to the integral; the central approximation condition
SP2 requires $C^{(0)}$ to be small enough, to the effect that $f(z)$ can be suitably reduced
to its quadratic expansion. Usually, one has to take $\|C^{(0)}\| / \|C\| \to 0$, and the following
observation may help make the right choices. The error in the two-term expansion
being likely given by the next term, which involves a third derivative, it is a good guess
to dimension $C^{(0)}$ so that it be of length $\delta \equiv \delta(n)$ chosen in such a way that

$$f''(\zeta)\,\delta^2 \to \infty, \qquad f'''(\zeta)\,\delta^3 \to 0, \tag{21}$$

so that both tail and central approximation conditions can be satisfied. We call this
choice the saddle point dimensioning heuristic.

On another register, it often proves convenient to adopt integration paths that 
come close enough to the saddle point but need not pass exactly through it. In the same 
vein, a steepest descent curve may be followed only approximately. Such choices 
will still lead to valid conclusions, as long as the conditions of Theorem VIII.3 are 
verified. (Note carefully that these conditions neither impose that the contour should 
pass strictly through the saddle point, nor that a steepest descent curve should be 
exactly followed.) 


Saddle point method for Cauchy coefficient integrals. For the purposes of
analytic combinatorics, the general saddle point method specializes. We are given a
generating function $G(z)$, assumed to be analytic at the origin and with nonnegative
coefficients, and seek an asymptotic form of the coefficients, given in integral form by

$$[z^n]G(z) = \frac{1}{2i\pi} \oint_C G(z)\,\frac{dz}{z^{n+1}}.$$

There, $C$ encircles the origin, lies within the domain where $G$ is analytic, and is
positively oriented. This is a particular case of the general integral (14) considered earlier,
with the integrand being now $F(z) = G(z)z^{-n-1}$.

The geometry of the problem is simple, and, for reasons seen in the previous
section, it suffices to consider as integration contour a circle centred at the origin and
passing through (or very near) a saddle point present on the positive real line. It is
then natural to make use of polar coordinates and set

$$z = re^{i\theta},$$

where the radius $r$ of the circle is chosen equal to (or close to) the saddle point value $\zeta$.

Under the circumstances, the basic split of the contour $C = C^{(0)} \cup C^{(1)}$ involves
a central part $C^{(0)}$, which is an arc of the circle of radius $r$ determined by $|\theta| \le \theta_0$ for
some suitably chosen $\theta_0$. On $C^{(0)}$, a quadratic approximation should hold, according
to SP2 [central approximation]. On the rest $C^{(1)}$ of the contour, the function $G(z)$
should be small in comparison to its value $G(r)$, according to SP1 [tails pruning].
(Observe that $|z^{-n-1}|$ remains constant along any circle centred at the origin.) The
choice of the angle $\theta_0$ often turns out to be successful when one follows the
dimensioning heuristic of (21). In this range of problems, checking the condition SP3 [tails
completion] is normally a mere formality.


The example below details the main steps of the saddle point analysis of inverse 
factorials, based on the foregoing principles. 


EXAMPLE VIII.3. Saddle point analysis of the exponential and the inverse factorial. The goal
is to estimate $\frac{1}{n!} = [z^n]e^z$, the starting point being

$$K_n = \frac{1}{2i\pi} \oint_{|z|=r} e^{z}\,\frac{dz}{z^{n+1}},$$

where integration is performed along a circle of radius $r$. The landscape of the modulus of the
integrand has been already displayed in Figure 3, p. 515: there is a saddle point of $G(z)z^{-n-1}$
at $\zeta = n + 1$ with an axis perpendicular to the real line. We thus expect an asymptotic estimate
to derive from adopting a circle passing through the saddle point, or about. In our treatment, we
fix the choice $r = n$, by which calculations develop somewhat more smoothly.

We switch to polar coordinates and set $z = ne^{i\theta}$. The original integral becomes, in polar
coordinates,

$$K_n = \frac{e^n}{n^n} \cdot \frac{1}{2\pi} \int_{-\pi}^{+\pi} e^{n(e^{i\theta} - 1 - i\theta)}\,d\theta, \tag{22}$$

where, for readability, we have taken out the factor $G(r)/r^n \equiv e^n/n^n$. Set $h(\theta) = e^{i\theta} - 1 - i\theta$.
The function $|e^{h(\theta)}| = e^{\cos\theta - 1}$ is unimodal with its peak at $\theta = 0$ and the same property holds
for $|e^{nh(\theta)}|$, representing the modulus of the integrand in (22), which gets more and more
strongly peaked at $\theta = 0$, as $n \to +\infty$; see Figure 5.

In agreement with the saddle point strategy, the estimation of $K_n$ proceeds by isolating a
small portion of the contour, corresponding to $z$ near the real axis. We thus introduce

$$K_n^{(0)} = \int_{-\theta_0}^{+\theta_0} e^{nh(\theta)}\,d\theta, \qquad K_n^{(1)} = \int_{\theta_0}^{2\pi - \theta_0} e^{nh(\theta)}\,d\theta,$$

and choose $\theta_0$ in accordance with the general heuristic of Equation (21), which instantiates to
the two conditions $n\theta_0^2 \to \infty$ and $n\theta_0^3 \to 0$. One way of realizing the compromise is to adopt
$\theta_0 = n^{a}$, where $a$ is any number between $-\frac{1}{2}$ and $-\frac{1}{3}$. We hence fix, rather arbitrarily,

$$\theta_0 \equiv \theta_0(n) = n^{-2/5}. \tag{23}$$

In particular, the angle of the central region tends to zero.


FIGURE VIII.5. Plots of $|e^z z^{-n-1}|$ for $n = 3$ and $n = 30$ (scaled according to the
value of the saddle point) illustrate the essential concentration condition as higher values
of $n$ produce steeper saddle point paths.


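(i) Tails pruning. For $z = ne^{i\theta}$ one has $|e^z| = e^{n\cos\theta}$, and, by unimodality properties of
the cosine, the tail integral $K_n^{(1)}$ satisfies

$$\big| K_n^{(1)} \big| = O\big( e^{-n(1 - \cos\theta_0)} \big) = O\big( \exp(-Cn^{1/5}) \big) \tag{24}$$

for some $C > 0$. The tail integral is thus exponentially small.

(ii) Central approximation. Near $\theta = 0$, one has $h(\theta) \equiv e^{i\theta} - 1 - i\theta = -\frac{1}{2}\theta^2 + O(\theta^3)$,
so that, for $|\theta| \le \theta_0$,

$$e^{nh(\theta)} = e^{-n\theta^2/2} \big( 1 + O(n\theta_0^3) \big).$$

Since $\theta_0 = n^{-2/5}$, we have

$$K_n^{(0)} = \int_{-n^{-2/5}}^{+n^{-2/5}} e^{-n\theta^2/2}\,d\theta\,\big( 1 + O(n^{-1/5}) \big), \tag{25}$$

which, by the change of variables $t = \theta\sqrt{n}$, is conveniently rewritten as

$$K_n^{(0)} = \frac{1}{\sqrt{n}} \int_{-n^{1/10}}^{+n^{1/10}} e^{-t^2/2}\,dt\,\big( 1 + O(n^{-1/5}) \big). \tag{26}$$

The central integral is thus asymptotic to an incomplete Gaussian integral.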


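(iii) Tails completion. Given (26), the task is now easy. We have, elementarily, for $a > 0$,

$$\int_a^{+\infty} e^{-t^2/2}\,dt = O\big( e^{-a^2/2} \big), \tag{27}$$

which expresses the exponential smallness of Gaussian tails. As a consequence,

$$K_n^{(0)} \sim \frac{1}{\sqrt{n}} \int_{-\infty}^{+\infty} e^{-t^2/2}\,dt \equiv \sqrt{\frac{2\pi}{n}}. \tag{28}$$

Assembling (24) and (28), we obtain

$$K_n^{(0)} + K_n^{(1)} \sim \sqrt{\frac{2\pi}{n}}, \qquad \text{i.e.,} \qquad K_n = \frac{e^n}{n^n} \cdot \frac{1}{2\pi} \big( K_n^{(0)} + K_n^{(1)} \big) \sim \frac{e^n n^{-n}}{\sqrt{2\pi n}}.$$

The proof also provides a relative error term of $O(n^{-1/5})$. Stirling’s formula is thus seen to be
(inter alia!) a consequence of the saddle point method.
END OF EXAMPLE VIII.3.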
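The polar form (22) lends itself to a direct numerical check. The sketch below is illustrative: the function name `polar_integral` and the sample count are arbitrary choices. It approximates the integral in (22) by the trapezoidal rule (well suited to a smooth periodic integrand) and compares the result with $1/n!$ and with the Gaussian value $\sqrt{2\pi/n}$ of (28).

```python
import cmath
import math

def polar_integral(n, samples=4096):
    """Trapezoidal approximation of the integral over [-pi, pi] of
    exp(n * (e^{i*theta} - 1 - i*theta)), i.e. the integral appearing in (22)."""
    total = 0.0 + 0.0j
    for k in range(samples):
        theta = -math.pi + 2.0 * math.pi * k / samples
        h = cmath.exp(1j * theta) - 1.0 - 1j * theta
        total += cmath.exp(n * h)
    return total * (2.0 * math.pi / samples)

n = 20
integral = polar_integral(n)

# K_n as given by (22); the imaginary part of the integral cancels by symmetry.
k_n = math.exp(n) / n ** n * integral.real / (2.0 * math.pi)

print(k_n * math.factorial(n))                      # K_n relative to 1/n!
print(integral.real / math.sqrt(2.0 * math.pi / n)) # integral relative to the Gaussian value
```

The first ratio tests the exact identity $K_n = 1/n!$, while the second shows how close the full integral already is to the complete Gaussian integral at moderate $n$.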
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Complete asymptotic expansions. Just like Laplace’s method, the saddle point 
method can be made to provide full asymptotic expansions. The idea is still to localize
the main contribution in the central region, but now take into account correction terms
to the quadratic approximation. As an illustration of these general principles, we make
explicit here the calculations relative to the inverse factorial.

It suffices to revisit the estimation of $K_n^{(0)}$ since $K_n^{(1)}$ is exponentially small. One
first rewrites, with $h(\theta) = e^{i\theta} - 1 - i\theta$ as before,

$$K_n^{(0)} = \int_{-\theta_0}^{\theta_0} e^{-n\theta^2/2}\,e^{n(h(\theta) + \frac{1}{2}\theta^2)}\,d\theta
= \frac{1}{\sqrt{n}} \int_{-\theta_0\sqrt{n}}^{\theta_0\sqrt{n}} e^{-w^2/2}\,e^{n\xi(w/\sqrt{n})}\,dw, \qquad \xi(\theta) := h(\theta) + \frac{1}{2}\theta^2.$$

The calculation proceeds exactly in the same way as for the Laplace method
(APPENDIX B: Laplace’s method, p. 700). It suffices to expand $h(\theta)$ to any fixed order, which
is legitimate in the central region. In this way, a representation of the form

$$K_n^{(0)} = \frac{1}{\sqrt{n}} \int_{-\theta_0\sqrt{n}}^{\theta_0\sqrt{n}} e^{-w^2/2} \left( 1 + \sum_{k=1}^{M-1} \frac{E_k(w)}{n^{k/2}} + O\left( \frac{1 + w^{3M}}{n^{M/2}} \right) \right) dw$$

is obtained, where the $E_k(w)$ are computable polynomials of degree $3k$. Distributing
the integral operator over terms in the asymptotic expansion and completing the tails
yields an expansion of the form

$$K_n^{(0)} \sim \frac{1}{\sqrt{n}} \left( \sum_{k=0}^{M-1} \frac{d_k}{n^{k/2}} + O(n^{-M/2}) \right),
\qquad \text{where } d_0 = \sqrt{2\pi}, \quad d_k := \int_{-\infty}^{+\infty} e^{-w^2/2} E_k(w)\,dw.$$

All odd terms disappear by parity. The net result is then:
Proposition VIII.1 (Stirling’s formula). The factorial numbers satisfy

$$\frac{1}{n!} \sim \frac{e^n n^{-n}}{\sqrt{2\pi n}} \left( 1 - \frac{1}{12n} + \frac{1}{288n^2} + \frac{139}{51840n^3} - \frac{571}{2488320n^4} + \cdots \right).$$

Notice the amazing similarity with the form obtained directly for $n!$ in
APPENDIX B: Laplace’s method, p. 700.


▷ VIII.6. A factorial surprise. Why is it that the expansions of $n!$ and $1/n!$ involve the same
set of coefficients, up to sign? ◁
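The expansion of Proposition VIII.1 can be tested term by term. In the sketch below (names such as `COEFFS` and `inverse_factorial_series` are illustrative), the coefficients are copied from the proposition, and the relative error of each truncation is measured against the exact value of $1/n!$.

```python
import math

# Coefficients of the expansion of 1/n! in Proposition VIII.1, in powers of 1/n.
COEFFS = [1.0, -1.0 / 12, 1.0 / 288, 139.0 / 51840, -571.0 / 2488320]

def inverse_factorial_series(n, terms):
    """Truncation of the expansion of 1/n! to the given number of terms."""
    prefactor = math.exp(n) * n ** (-n) / math.sqrt(2.0 * math.pi * n)
    return prefactor * sum(c / n ** k for k, c in enumerate(COEFFS[:terms]))

n = 10
errors = []
for terms in range(1, len(COEFFS) + 1):
    # Relative error of the truncated expansion with respect to the exact 1/n!.
    rel_err = abs(inverse_factorial_series(n, terms) * math.factorial(n) - 1.0)
    errors.append(rel_err)
    print(terms, rel_err)
```

Each additional term reduces the error by roughly one order of magnitude at $n = 10$, as expected from an asymptotic expansion in powers of $1/n$.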


VIII. 4. Three combinatorial examples 


The saddle point method gives access to a number of asymptotic problems coming
from analytic combinatorics. In this section, we further illustrate its use by treating in
some detail³ three combinatorial examples:

Involutions ($\mathcal{I}$), Set partitions ($\mathcal{S}$), Fragmented permutations ($\mathcal{F}$).


³The purpose of these examples is to get some familiarity with the practice of the saddle point method
in analytic combinatorics. The impatient reader can jump directly to the next section, where she will find a
general theory that covers these and many more cases.


These are all labelled structures introduced in Chapter II. Their specifications and
EGFs are

$$\begin{array}{lll}
\text{Involutions:} & \mathcal{I} = \mathrm{SET}(\mathrm{SET}_{1,2}(\mathcal{Z})) & \Longrightarrow\ I(z) = e^{z + z^2/2} \\
\text{Set partitions:} & \mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z})) & \Longrightarrow\ S(z) = e^{e^z - 1} \\
\text{Fragmented perms:} & \mathcal{F} = \mathrm{SET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})) & \Longrightarrow\ F(z) = e^{z/(1-z)}.
\end{array} \tag{29}$$

The first two are entire functions (i.e., they only have a singularity at $\infty$), while the
last one has a singularity at $z = 1$. Each of these functions exhibits a fairly violent
growth, of an exponential type, near its positive singularity, at either a finite or
infinite distance. As the reader will have noticed, all three combinatorial types are
structurally characterized by a set construction applied to some simpler structure.

Each example is treated, starting from the easier saddle point bounds and
proceeding with the saddle point method. The procedure, which follows the treatment of
the function $e^z$, is rather systematic. For Cauchy coefficient integrals, the saddle point
method is best carried out in polar coordinates, which calls for evaluating

$$[z^n]G(z) = \frac{1}{2i\pi} \oint G(z)\,\frac{dz}{z^{n+1}} = \frac{r^{-n}}{2\pi} \int_{-\pi}^{+\pi} G(re^{i\theta})\,e^{-ni\theta}\,d\theta. \tag{30}$$


Under the trigonometrical form, it is seen that the best bound of type (6) is

$$[z^n]G(z) \le \frac{G(r)}{r^n}, \qquad \text{where } r\,\frac{G'(r)}{G(r)} = n. \tag{31}$$

We shall also refer to the equation defining $r$ as the saddle point equation. (The
bound (31) is almost the same as the bound provided by Theorem VIII.2, which is
$G(\zeta)\zeta^{-n}$, where $\zeta G'(\zeta)/G(\zeta) = n + 1$.) Setting

$$f(z) := \log G(z) - n \log z, \tag{32}$$

we see that, locally, a quadratic approximation without linear terms holds, namely,
with $\beta(r)$ a computable quantity (in terms of $f(r)$, $f'(r)$, $f''(r)$),

$$f(re^{i\theta}) - f(r) = -\frac{1}{2}\beta(r)\theta^2 + O(\theta^3), \tag{33}$$

for fixed $r$ (i.e., fixed $n$), as $\theta \to 0$.

It then suffices to choose a cutoff angle $\theta_0$, then carry out a verification of the
validity of the three conditions of the saddle point method, SP1, SP2, and SP3 of
Theorem VIII.3, p. 518, adjusted to take into account polar coordinate notations. The
cutoff angle $\theta_0$ is chosen as a function of $n$ (or, equivalently, $r$) in accordance with the
saddle point heuristic (21).
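As an illustration of the saddle point bound (31), the sketch below solves the saddle point equation $rG'(r)/G(r) = n$ by bisection for the fragmented-permutation EGF $F(z) = e^{z/(1-z)}$ of (29), whose logarithmic derivative is $1/(1-z)^2$. The function names are hypothetical, and the exact coefficient is obtained from the elementary expansion $[z^n](z/(1-z))^k = \binom{n-1}{k-1}$, a standard identity that is assumed here rather than taken from the text.

```python
import math

def solve_saddle_radius(log_derivative, n, lo, hi, iterations=200):
    """Bisection for r in (lo, hi) such that r * G'(r)/G(r) = n.

    Assumes r * log_derivative(r) is increasing on (lo, hi) and crosses n there."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * log_derivative(mid) < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fragmented_coefficient(n):
    """Exact [z^n] exp(z/(1-z)) = sum over k of C(n-1, k-1) / k!, for n >= 1."""
    return sum(math.comb(n - 1, k - 1) / math.factorial(k) for k in range(1, n + 1))

n = 20
# Saddle point equation r / (1 - r)^2 = n, solved on (0, 1).
r = solve_saddle_radius(lambda x: 1.0 / (1.0 - x) ** 2, n, 0.0, 1.0)
bound = math.exp(r / (1.0 - r)) / r ** n

print(r, bound, fragmented_coefficient(n))
```

For $n = 20$ the saddle point equation reduces to $20r^2 - 41r + 20 = 0$, whose root in $(0,1)$ is $r = 4/5$, which gives a convenient check on the bisection.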

The example of involutions treats a problem that is only a little more complicated 
than inverse factorials. The case of set partitions (Bell numbers) illustrates the need in 
general of a good asymptotic technology for implicitly defined saddle points. Finally, 
fragmented permutations, with their singularity at a finite distance, pave the way for
the (harder) analysis of integer partitions in Section VIII. 6.

EXAMPLE VIII.4. Involutions. A permutation $\tau$ such that $\tau^2$ is the identity is an involution
(p. 113). The corresponding EGF is $I(z) = e^{z + z^2/2}$. We have in the notation of (32)

$$f(z) = z + \frac{z^2}{2} - n \log z,$$

and the saddle point equation is

$$r(1 + r) = n, \qquad \text{implying} \qquad r = -\frac{1}{2} + \frac{1}{2}\sqrt{4n+1} \sim \sqrt{n} - \frac{1}{2} + \frac{1}{8\sqrt{n}} + O(n^{-3/2}).$$

The use of the saddle point bound (31) then gives mechanically

$$\frac{I_n}{n!} \le \frac{e^{r + r^2/2}}{r^n} = e^{-1/4}\,n^{-n/2}\,e^{n/2 + \sqrt{n}}\,(1 + o(1)). \tag{34}$$

(Notice that if we use instead the approximate saddle point value, $\sqrt{n}$, we only lose a factor
$e^{-1/4} \doteq 0.77880$.)

The cutoff point between the central and noncentral regions is determined, in agreement
with (21), by the fact that the length $\delta$ of the contour (in $z$ coordinates) should satisfy
$f''(r)\delta^2 \to \infty$ and $f'''(r)\delta^3 \to 0$. In terms of angles, this means that we should use a cutoff
angle $\theta_0$ that satisfies

$$r^2 f''(r)\,\theta_0^2 \to \infty, \qquad r^3 f'''(r)\,\theta_0^3 \to 0.$$

Here, we have $f''(r) = O(1)$ and $f'''(r) = O(n^{-1/2})$. Thus, $\theta_0$ should be chosen of an order
somewhere in between $n^{-1/2}$ and $n^{-1/3}$, and we fix here

$$\theta_0 = n^{-2/5}.$$

(i) Tails pruning. First, some general considerations are in order regarding the behaviour
of $|I(z)|$ along large circles, $z = re^{i\theta}$. One has

$$\log |I(re^{i\theta})| = r\cos\theta + \frac{r^2}{2}\cos 2\theta.$$

As a function of $\theta$, this function decreases on $(0, \frac{\pi}{2})$, as it is the sum of two decreasing functions.
Thus, $|I(z)|$ attains its maximum ($e^{r + r^2/2}$) at $r$ and its minimum ($e^{-r^2/2}$) at $z = ri$. In the
left half plane, first for $\theta \in (\frac{\pi}{2}, \frac{3\pi}{4})$, the modulus $|I(z)|$ is at most $e^{r}$ since $\cos 2\theta < 0$. Finally,
for $\theta \in (\frac{3\pi}{4}, \pi)$ smallness is granted by the fact that $\cos\theta < -1/\sqrt{2}$, resulting in the bound
$|I(z)| \le e^{r^2/2 - r/\sqrt{2}}$. The same argument applies to the lower half plane $\Im(z) < 0$. As
a consequence of these bounds, $I(z)/I(r)$ is strongly peaked at $z = r$; in particular, it is
exponentially small away from the positive real axis, in the sense that

$$\frac{I(re^{i\theta})}{I(r)} = O\left( \frac{I(re^{i\theta_0})}{I(r)} \right) = O\big( \exp(-n^{\alpha}) \big), \qquad \theta \notin [-\theta_0, \theta_0], \tag{35}$$

for some $\alpha > 0$.
(ii) Central approximation. We then proceed and consider the central integral

$$J_n^{(0)} = \frac{e^{f(r)}}{2\pi} \int_{-\theta_0}^{+\theta_0} \exp\big( f(re^{i\theta}) - f(r) \big)\,d\theta.$$

What is required is a Taylor expansion with remainder near the point $r \sim \sqrt{n}$. In the central
region, the relations $f'(r) = 0$, $f''(r) = 2 + O(n^{-1/2})$, and $f'''(z) = O(n^{-1/2})$ yield

$$f(re^{i\theta}) - f(r) = \frac{r^2}{2} f''(r)(e^{i\theta} - 1)^2 + O(n^{-1/2} r^3 \theta_0^3) = -r^2\theta^2 + O(n^{-1/5}).$$

This is enough to guarantee that

$$J_n^{(0)} = \frac{e^{f(r)}}{2\pi} \int_{-\theta_0}^{+\theta_0} e^{-r^2\theta^2}\,d\theta\,\big( 1 + O(n^{-1/5}) \big). \tag{36}$$


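(iii) Tails completion. Since $r \sim \sqrt{n}$ and $\theta_0 = n^{-2/5}$, we have

$$\int_{-\theta_0}^{+\theta_0} e^{-r^2\theta^2}\,d\theta = \frac{1}{r} \int_{-\theta_0 r}^{+\theta_0 r} e^{-t^2}\,dt = \frac{1}{r} \left( \int_{-\infty}^{+\infty} e^{-t^2}\,dt + O\big( e^{-n^{1/5}} \big) \right). \tag{37}$$

Finally, Equations (35), (36), and (37) give:
Proposition VIII.2. The number $I_n$ of involutions satisfies

$$\frac{I_n}{n!} = \frac{e^{-1/4}}{2\sqrt{\pi n}}\,n^{-n/2}\,e^{n/2 + \sqrt{n}} \left( 1 + O\left( \frac{1}{n^{1/5}} \right) \right). \tag{38}$$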
Comparing the saddle point bound (34) to the true asymptotic form (38), we see that the 
former is only off by a factor of O(n'/ ?). Here is a table further comparing the asymptotic 
estimate I provided by the right side of (38) to the exact value of I: 


n= 10 n = 100 n = 1000 
Tio = 9496 Tyo0 = 2.40533 - 10°" Tip00 = 2.14392 - 1077" 


Tip = 8839  Ifo9 = 2.34149 - 10°? — Ffg99 = 2.12473 - 1017, 

The relative error is empirically close to 0.3/./n, a fact that could be proved by developing a 
complete asymptotic expansion along the lines exposed in the previous section, p. 522. 
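
The comparison in the table can be reproduced numerically. The following is a minimal sketch, not part of the original derivation: it assumes the standard recurrence $I_n = I_{n-1} + (n-1) I_{n-2}$ for involution numbers, and the function names are illustrative. Logarithms are used so that $n = 1000$ does not overflow floating point.

```python
import math

def involution_counts(n_max):
    """Exact I_0..I_n_max via the recurrence I_n = I_{n-1} + (n-1) I_{n-2}."""
    counts = [1, 1]
    for n in range(2, n_max + 1):
        counts.append(counts[n - 1] + (n - 1) * counts[n - 2])
    return counts[: n_max + 1]

def log_involution_estimate(n):
    """Natural log of n! times the main term on the right side of (38)."""
    return (math.lgamma(n + 1) - 0.25 - math.log(2.0)
            - 0.5 * math.log(math.pi * n)
            - 0.5 * n * math.log(n) + 0.5 * n + math.sqrt(n))

def relative_error(n, counts):
    """1 - estimate/exact, computed through logarithms to avoid overflow."""
    return 1.0 - math.exp(log_involution_estimate(n) - math.log(counts[n]))

counts = involution_counts(1000)
for n in (10, 100, 1000):
    # last column is the empirical error scale 0.3/sqrt(n) quoted in the text
    print(n, relative_error(n, counts), 0.3 / math.sqrt(n))
```

The printed relative errors can be set against the scale $0.3/\sqrt{n}$ mentioned above.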

The estimate (38) of $I_n$ is given by Knuth in [302], whose derivation is carried out by
means of the Laplace method applied to the binomial sum that expresses $I_n$. Our derivation
here follows Moser and Wyman’s in [367]. ................. END OF EXAMPLE VIII.4.


EXAMPLE VIII.5. Set partitions and Bell numbers. The number of partitions of a set of $n$
elements defines the Bell number $S_n$ (p. 100) and one has

$$S_n = n!\, e^{-1}\, [z^n] G(z) \qquad\text{where}\qquad G(z) = e^{e^z}.$$

The saddle point equation relative to $G(z) z^{-n-1}$ is

$$\zeta e^{\zeta} = n + 1.$$

This famous equation admits an asymptotic solution obtained by iteration (or “bootstrapping”):
it suffices to write $\zeta = \log(n+1) - \log\zeta$, and iterate (say, starting from $\zeta = 1$), which provides
the solution,

(39)    $$\zeta \equiv \zeta(n) = \log n - \log\log n + \frac{\log\log n}{\log n} + O\left(\frac{\log^2\log n}{\log^2 n}\right)$$

(see [111, p. 26] for a detailed discussion). The corresponding saddle point bound reads

$$S_n \le n!\, \frac{e^{e^{\zeta}-1}}{\zeta^{n}}.$$

The approximate solution $\hat\zeta = \log n$ provides in particular the simplified upper bound

$$S_n \le n!\, \frac{e^{n-1}}{(\log n)^{n}},$$

which is enough to check that there are much fewer set partitions than permutations, the ratio
being bounded from above by a quantity $e^{-n\log\log n + O(n)}$.

In order to implement the saddle point strategy, integration will be carried out over a circle
of radius $r \equiv \zeta$. We then set

$$f(z) = \log\left(\frac{G(z)}{z^{n+1}}\right) = e^{z} - (n+1)\log z,$$

and proceed to estimate the integral,

$$J_n = \frac{1}{2i\pi}\int_{\gamma} G(z)\, \frac{dz}{z^{n+1}},$$

along the circle $\gamma$ of radius $r$. The usual saddle point heuristic suggests to define the range of
the saddle point by a quantity $\theta_0 \equiv \theta_0(n)$ such that the quadratic terms in the expansion of $f$
at $r$ tend to infinity, while the cubic terms tend to zero. In order to carry out the calculations,
it is convenient to express all quantities in terms of $r$ alone, which is possible since $n$ can be
disposed of by means of the relation $n + 1 = re^{r}$. We find:

$$f''(r) = e^{r}(1 + r^{-1}), \qquad f'''(r) = e^{r}(1 - 2r^{-2}).$$

Thus, $\theta_0$ should be chosen such that $r^2 e^{r}\theta_0^2 \to \infty$, $r^3 e^{r}\theta_0^3 \to 0$, and the choice
$r\theta_0 = e^{-2r/5}$ is suitable.
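
The bootstrapping iteration for the saddle point $\zeta$ is easy to try out. The sketch below is illustrative only (the function names are not from the text): it iterates $\zeta \leftarrow \log(n+1) - \log\zeta$ from $\zeta = 1$ and prints, for comparison, the first three terms of the expansion (39).

```python
import math

def saddle_by_iteration(n, steps=200):
    """Iterate zeta <- log(n+1) - log(zeta), starting from zeta = 1."""
    zeta = 1.0
    for _ in range(steps):
        zeta = math.log(n + 1) - math.log(zeta)
    return zeta

def saddle_expansion(n):
    """First three terms of the expansion (39)."""
    log_n = math.log(n)
    log_log_n = math.log(log_n)
    return log_n - log_log_n + log_log_n / log_n

for n in (10, 100, 1000, 10**6):
    z = saddle_by_iteration(n)
    # z * exp(z) should reproduce n + 1; the last column is the truncated expansion
    print(n, z, z * math.exp(z), saddle_expansion(n))
```

The iteration contracts with ratio about $1/\zeta$, so it converges quickly once $n$ is moderately large; the truncated expansion approaches the fixed point only slowly, in line with its logarithmic error term.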


(i) Tails pruning. First, observe that the function $G(z)$ is strongly concentrated near the
real axis since, with $z = re^{i\theta}$, there holds

(40)    $$|e^{z}| = e^{r\cos\theta}, \qquad \left|e^{e^{z}}\right| \le e^{e^{r\cos\theta}}.$$

In particular $G(re^{i\theta})$ is exponentially smaller than $G(r)$ for any fixed $\theta \ne 0$, when $r$ gets large.

(ii) Central approximation. One then considers the central contribution,

$$J_n^{(0)} := \frac{1}{2i\pi}\int_{\gamma_0} G(z)\, \frac{dz}{z^{n+1}},$$

where $\gamma_0$ is the part of the circle $z = re^{i\theta}$ such that $|\theta| \le \theta_0 \equiv e^{-2r/5} r^{-1}$. Since on $\gamma_0$, the
third derivative is uniformly $O(e^{r})$, one has there

$$f(re^{i\theta}) = f(r) - \frac{1}{2} r^2\theta^2 f''(r) + O(r^3\theta^3 e^{r}).$$

This approximation can then be transported into the integral $J_n^{(0)}$.

(iii) Tails completion. Tails can be completed in the usual way. The net effect is the
estimate

$$[z^n] G(z) = \frac{e^{f(r)}}{\sqrt{2\pi f''(r)}} \left(1 + O\left(r^3\theta_0^3 e^{r}\right)\right),$$

which, upon making the error term explicit, rephrases as follows.


Proposition VIII.3. The number $S_n$ of set partitions of size $n$ satisfies

(41)    $$S_n = n!\, \frac{e^{e^{\zeta}-1}}{\zeta^{n}\sqrt{2\pi\zeta(\zeta+1)e^{\zeta}}} \left(1 + O\left(e^{-\zeta/5}\right)\right),$$

where $\zeta$ is defined implicitly by $\zeta e^{\zeta} = n + 1$, so that $\zeta = \log n - \log\log n + o(1)$.

Here is a numerical table of the exact values $S_n$ compared to the main term $S_n^{\star}$ of the
approximation (41):

    n              10         100                   1000
    S_n            115975     4.75853 · 10^115      2.98990 · 10^1927
    S_n^\star      114204     4.75537 · 10^115      2.99012 · 10^1927

The error is about 1.5% for $n = 10$, less than $10^{-3}$ and $10^{-4}$ for $n = 100$ and $n = 1000$.

The asymptotic form in terms of $\zeta$ itself is the proper one as no back substitution of an
asymptotic expansion of $\zeta$ (in terms of $n$ and $\log n$) can provide an asymptotic expansion for
$S_n$ solely in terms of $n$. Regarding explicit representations in terms of $n$, it is only $\log S_n$ that
can be expanded as

$$\frac{1}{n}\log S_n = \log n - \log\log n - 1 + \frac{\log\log n}{\log n} + \frac{1}{\log n} + O\left(\left(\frac{\log\log n}{\log n}\right)^{2}\right).$$
(Saddle point estimates of coefficient integrals often involve such implicitly defined quantities.) 
This example probably constitutes the most famous application of saddle point techniques 
to combinatorial enumeration. The first correct treatment by means of the saddle point method 
is due to Moser and Wyman [366]. It is used for instance by de Bruijn in [111, p. 104-108] as 
a lead example of the method. .......................00000- END OF EXAMPLE VIII.5. 
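
As a numerical companion to approximation (41), here is a small sketch under stated assumptions: Bell numbers are generated exactly with the classical Bell triangle, the saddle point equation $\zeta e^{\zeta} = n+1$ is solved by Newton's method (a choice made here for convenience, not the bootstrapping of the text), and all function names are illustrative.

```python
import math

def bell_numbers(n_max):
    """Exact Bell numbers S_0..S_n_max computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

def solve_saddle(n):
    """Newton iteration for zeta * exp(zeta) = n + 1, started above the root."""
    zeta = math.log(n + 1.0)
    for _ in range(60):
        f = zeta * math.exp(zeta) - (n + 1.0)
        zeta -= f / (math.exp(zeta) * (1.0 + zeta))
    return zeta

def log_bell_estimate(n):
    """Natural log of the main term of (41)."""
    zeta = solve_saddle(n)
    ez = math.exp(zeta)
    return (math.lgamma(n + 1) + ez - 1.0 - n * math.log(zeta)
            - 0.5 * math.log(2.0 * math.pi * zeta * (zeta + 1.0) * ez))

bells = bell_numbers(100)
for n in (10, 100):
    ratio = math.exp(log_bell_estimate(n) - math.log(bells[n]))
    print(n, bells[n], ratio)  # ratio = (main term of (41)) / S_n
```

Working with logarithms keeps the comparison usable for large $n$, where $S_n$ exceeds the floating-point range.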


EXAMPLE VIII.6. Fragmented permutations. These correspond to $F(z) = \exp(z/(1-z))$.
The example now illustrates the case of a singularity at a finite distance. We set as usual

$$f(z) = \frac{z}{1-z} - (n+1)\log z,$$

and start with saddle point bounds. The saddle point equation is

(42)    $$\frac{\zeta}{(1-\zeta)^2} = n + 1,$$

so that $\zeta$ comes close to the singularity at 1 as $n$ gets large:

$$\zeta = \frac{2n + 3 - \sqrt{4n+5}}{2n+2} = 1 - \frac{1}{\sqrt{n}} + \frac{1}{2n} + O(n^{-3/2}).$$

Here, the approximation $\hat\zeta(n) = 1 - 1/\sqrt{n}$ leads to

$$[z^n] F(z) \le e^{-1/2} e^{2\sqrt{n}}\,(1 + o(1)).$$

The saddle point method is then applied with integration along a circle of radius $r = \zeta$.
The saddle point heuristic suggests to localize the integral to a small sector of angle $2\theta_0$ and
since $f''(r) = O(n^{3/2})$ while $f'''(r) = O(n^{2})$, this means taking $\theta_0$ such that $n^{3/4}\theta_0 \to \infty$
and $n^{2/3}\theta_0 \to 0$. For instance, the choice $\theta_0 = n^{-7/10}$ is suitable. Concentration is easily
verified: we have

$$\left|e^{1/(1-z)}\right|_{z = re^{i\theta}} = \exp\left(\frac{1 - r\cos\theta}{1 - 2r\cos\theta + r^2}\right),$$

which is a unimodal function of $\theta$ for $\theta \in (-\pi, \pi)$. (The maximum of this function of $\theta$ is of
order $\exp((1-r)^{-1})$ attained at $\theta = 0$; the minimum is $O(1)$ attained at $\theta = \pi$.) In particular,
along the noncentral part $|\theta| \ge \theta_0$ of the saddle point circle, one has

(43)    $$\left|e^{1/(1-z)}\right|_{z = re^{i\theta}} = O\left(\exp\left(\sqrt{n} - n^{1/10}\right)\right),$$

so that tails are exponentially small. Local expansions then enable us to justify the use of the
general saddle point formula (Theorem VIII.3) in this case. The net result is:

Proposition VIII.4. The number of fragmented permutations, $F_n = n![z^n]F(z)$, satisfies

(44)    $$\frac{F_n}{n!} \sim \frac{e^{-1/2} e^{2\sqrt{n}}}{2\sqrt{\pi}\, n^{3/4}}.$$

The estimate (44) is only $O(n^{-3/4})$ off the corresponding saddle point bound. The relative
error of the approximation is about 4%, 1%, 0.3% for $n = 10, 100, 1000$, respectively.

The expansion above has been extended by E. Maitland Wright [505, 506] to several
classes of functions with a singularity whose type is an exponential of a function of the form
$(1-z)^{-\rho}$. (For the case of (44), Wright [505] refers to an earlier article of Perron published in
1914.) His interest was due, at least partly, to applications to generalized partition asymptotics,
of which the basic cases are discussed in Section VIII.6, p. 540. END OF EXAMPLE VIII.6.
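
The quality of estimate (44) can be checked against exact values. The sketch below relies on the expansion $[z^n]\exp(z/(1-z)) = \sum_{k=1}^{n} \binom{n-1}{k-1}\frac{1}{k!}$, obtained by expanding the exponential termwise (this identity and the function names are supplied here for illustration, not quoted from the text).

```python
import math

def fragmented_count(n):
    """Exact F_n = n! [z^n] exp(z/(1-z)) = sum_{k=1..n} C(n-1, k-1) n!/k!  (n >= 1)."""
    nf = math.factorial(n)
    return sum(math.comb(n - 1, k - 1) * (nf // math.factorial(k))
               for k in range(1, n + 1))

def fragmented_estimate(n):
    """Main term of (44), an approximation of F_n / n!."""
    return math.exp(-0.5 + 2.0 * math.sqrt(n)) / (2.0 * math.sqrt(math.pi) * n ** 0.75)

def relative_error(n):
    """|estimate / exact - 1| for the ratio F_n / n!."""
    exact = fragmented_count(n) / math.factorial(n)
    return abs(fragmented_estimate(n) / exact - 1.0)

for n in (10, 100, 1000):
    print(n, relative_error(n))
```

The errors shrink slowly with $n$, consistent with a correction term of order $n^{-1/2}$.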


▷ VIII.7. Wright’s expansions. Here is a special case. Consider the function

$$F(z) = (1-z)^{-\beta} \exp\left(\frac{A}{(1-z)^{\rho}}\right), \qquad A > 0, \quad \rho > 0.$$

Then, a saddle point analysis yields, e.g., when $\rho < 1$,

$$[z^n] F(z) \sim \frac{N^{\beta - 1 - \rho/2} \exp\left(A(\rho+1)N^{\rho}\right)}{\sqrt{2\pi A\rho(\rho+1)}}, \qquad N := \left(\frac{n}{A\rho}\right)^{\frac{1}{\rho+1}}.$$

(The case $\rho \ge 1$ involves more terms of the asymptotic expansion of the saddle point.) The
method generalizes to analytic and logarithmic multipliers, as well as to a sum of terms of the
form $A(1-z)^{-\rho}$ inside the exponential. See [505, 506] for details. ◁


▷ VIII.8. Some oscillating coefficients. Define the function

$$s(z) = \sin\left(\frac{z}{1-z}\right).$$

The coefficients $s_n = [z^n]s(z)$ are seen to change sign at $n = 6, 21, 46, 81, 125, 180, \ldots$. Do
signs change infinitely many times? (Hint: Yes. There are two complex conjugate saddle points
and their asymptotic forms combine a growth of the form $n^{-a}e^{b\sqrt{n}}$ with an oscillating factor
similar to $\sin(c\sqrt{n})$.) The sum


exhibits similar fluctuations. ◁


VIII.5. Admissibility


The saddle point method is a versatile approach to the analysis of coefficients
of fast-growing generating functions, but one which is often cumbersome to apply
step-by-step. Fortunately, it proves possible to encapsulate the conditions encountered
in the analysis of our previous examples into a general framework. This leads
to the notion of an admissible function presented in Subsection VIII.5.1. By design,
saddle point analysis applies to such functions and asymptotic forms for their
coefficients can be systematically determined, following an approach initiated by
Hayman in 1956. A great merit of abstraction in this context is that admissible functions
satisfy useful closure properties, so that an infinite class of admissible functions of
relevance to combinatorial applications can be determined; we develop this theme
in Subsection VIII.5.2, relative to enumeration, and VIII.5.3, relative to moments of
combinatorial parameters.

VIII.5.1. Admissibility theory. The notion of admissibility is in essence an
axiomatization of the conditions underlying Theorem VIII.3 specialized to the case of
Cauchy coefficient integrals. In this section, we base our discussion on H-admissibility,
the prefix H being a token of Hayman’s original contribution [263]. A crisp
account of the theory is given in Section II.7 of Wong’s book [502] and in Odlyzko’s
authoritative survey [377, Sec. 12].

We consider here a function $G(z)$ that is analytic at the origin and whose coefficients
$[z^n]G(z)$, not necessarily all nonnegative, are to be estimated. The switch to
polar coordinates is natural, and the expansion of $G(re^{i\theta})$ for small $\theta$ should play a
central rôle. With $r$ a positive real number lying within the disc of analyticity of $G(z)$,
the fundamental expansion is then

(45)    $$\log G(re^{i\theta}) = \log G(r) + \sum_{\nu=1}^{\infty} \alpha_\nu(r)\, \frac{(i\theta)^{\nu}}{\nu!}.$$

Not surprisingly, the most important quantities are the first two terms, and once $G(z)$
has been put into exponential form, $G(z) = e^{h(z)}$, a simple computation yields

(46)    $$a(r) := \alpha_1(r) = r h'(r), \qquad b(r) := \alpha_2(r) = r^2 h''(r) + r h'(r), \qquad\text{with } h(z) := \log G(z).$$

In terms of $G$ itself, one has

(47)    $$a(r) = r\,\frac{G'(r)}{G(r)}, \qquad b(r) = r\,\frac{G'(r)}{G(r)} + r^2\,\frac{G''(r)}{G(r)} - r^2\left(\frac{G'(r)}{G(r)}\right)^{2}.$$

Whenever $G(z)$ has nonnegative Taylor coefficients at the origin, $b(r)$ is positive for
$r > 0$ and $a(r)$ increases unboundedly as $r \to \rho$. (This follows from the argument
encountered in Note 4, p. 516.)


Definition VIII.1 (Hayman-admissibility). Let $G(z)$ have radius of convergence $\rho$
with $0 < \rho \le +\infty$ and be always positive on some subinterval $(R_0, \rho)$ of $(0, \rho)$. The
function $G(z)$ is said to be admissible if it satisfies the following three conditions.

H1. [Capture condition] $\lim_{r \to \rho} b(r) = +\infty$.

H2. [Locality condition] For some function $\theta_0(r)$ defined over $(R_0, \rho)$ and satisfying
$0 < \theta_0 < \pi$, one has

$$G(re^{i\theta}) \sim G(r)\, e^{i\theta a(r) - \theta^2 b(r)/2} \qquad\text{as } r \to \rho,$$

uniformly in $|\theta| \le \theta_0(r)$.

H3. [Decay condition] Uniformly in $\theta_0(r) \le |\theta| < \pi$,

$$G(re^{i\theta}) = o\left(\frac{G(r)}{\sqrt{b(r)}}\right).$$

Admissible functions in the above sense are also called Hayman admissible or
H-admissible.


Note that the conditions in the definition are intrinsic to the function: they only
make reference to the function’s values along circles, no parameter $n$ being involved
yet. It can be easily verified that the functions $e^{z}$, $e^{e^{z}-1}$, and $e^{z+z^2/2}$ are admissible
with $\rho = +\infty$, and that the function $e^{z/(1-z)}$ is admissible with $\rho = 1$. On the
negative side, functions like $e^{z^2}$ and $e^{z^2} + e^{z}$ are not admissible since they attain
values that are too large when $\arg(z)$ is near $\pi$.

Coefficients of H-admissible functions can be systematically analysed to first
asymptotic order, as expressed by the following theorem. The proof simply amounts
to transcribing the definition of admissibility into the conditions of Theorem VIII.3.


Theorem VIII.4 (Coefficients of admissible functions). Let $G(z)$ be an H-admissible
function and $\zeta \equiv \zeta(n)$ be the unique solution in the interval $(R_0, \rho)$ of the saddle
point equation

(48)    $$\zeta\, \frac{G'(\zeta)}{G(\zeta)} = n.$$

The Taylor coefficients of $G(z)$ satisfy

(49)    $$g_n \equiv [z^n] G(z) \sim \frac{G(\zeta)}{\zeta^{n}\sqrt{2\pi b(\zeta)}} \qquad\text{as } n \to \infty,$$

with $b(z) = z^2 h''(z) + z h'(z)$ and $h(z) = \log G(z)$.
PROOF. Integration is carried out over a circle centred at the origin, of some radius $r$
to be specified shortly. Under the change of variables $z = re^{i\theta}$, the Cauchy coefficient
formula becomes

(50)    $$g_n \equiv [z^n] G(z) = \frac{r^{-n}}{2\pi} \int_{-\pi}^{+\pi} G(re^{i\theta})\, e^{-ni\theta}\, d\theta.$$

In order to obtain a quadratic approximation without a linear term, one chooses
the radius of the circle as the positive solution $\zeta$ of the equation $a(\zeta) = n$, that is,
a solution of Equation (48). (Thus $\zeta$ is a saddle point of $G(z)z^{-n}$.) By the remarks
accompanying (47), we have $\zeta \to \rho^{-}$ as $n \to +\infty$. Following the general saddle
point strategy, we decompose the integration domain and set

$$J^{(0)} = \int_{-\theta_0}^{+\theta_0} G(\zeta e^{i\theta})\, e^{-ni\theta}\, d\theta, \qquad J^{(1)} = \int_{\theta_0}^{2\pi - \theta_0} G(\zeta e^{i\theta})\, e^{-ni\theta}\, d\theta.$$


(i) Tails pruning. By the capture condition H1 and the decay condition H3, we
have the trivial bound, which proves sufficient for our purposes:

(51)    $$\left|J^{(1)}\right| = o\left(\frac{G(\zeta)}{\sqrt{b(\zeta)}}\right).$$

(ii) Central approximation. The uniformity of locality condition H2 implies

(52)    $$J^{(0)} \sim G(\zeta) \int_{-\theta_0}^{+\theta_0} e^{-\theta^2 b(\zeta)/2}\, d\theta.$$

(iii) Tails completion. A combination of the locality condition H2 and the decay
condition H3, instantiated at $\theta = \theta_0$, shows that $b(\zeta)\theta_0^2 \to +\infty$ as $n \to +\infty$. There
results that tails can be completed back, and

(53)    $$\int_{-\theta_0}^{+\theta_0} e^{-b(\zeta)\theta^2/2}\, d\theta \sim \frac{1}{\sqrt{b(\zeta)}} \int_{-\theta_0\sqrt{b(\zeta)}}^{+\theta_0\sqrt{b(\zeta)}} e^{-t^2/2}\, dt \sim \frac{1}{\sqrt{b(\zeta)}} \int_{-\infty}^{+\infty} e^{-t^2/2}\, dt.$$

From (51), (52), and (53) (or equivalently via an application of Theorem VIII.3),
the conclusion of the theorem follows.

FIGURE VIII.6. The families of Boltzmann distributions associated with involutions
($G(z) = e^{z+z^2/2}$ with $r = 4..8$) and set partitions ($G(z) = e^{e^{z}-1}$ with $r = 2..3$) obey
an approximate Gaussian profile.
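
The recipe of Theorem VIII.4 is mechanical enough to be sketched in code. In the following illustrative helper (the name `hayman_estimate` and its interface are assumptions made for this example), the caller supplies $\log G(r)$, $a(r)$, and $b(r)$ as functions; the saddle point equation $a(\zeta) = n$ is solved by bisection, assuming $a$ is increasing on the bracket. Admissibility itself is not checked.

```python
import math

def hayman_estimate(log_g, a, b, n, lo, hi):
    """Main term G(zeta) / (zeta^n sqrt(2 pi b(zeta))) of (49).

    log_g, a, b are callables for log G(r), a(r) = r h'(r) and
    b(r) = r^2 h''(r) + r h'(r); a(zeta) = n is solved by bisection on
    [lo, hi], assuming a is increasing there.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if a(mid) < n:
            lo = mid
        else:
            hi = mid
    zeta = 0.5 * (lo + hi)
    log_value = (log_g(zeta) - n * math.log(zeta)
                 - 0.5 * math.log(2.0 * math.pi * b(zeta)))
    return math.exp(log_value)

# G(z) = exp(z): h(z) = z, so a(r) = b(r) = r and the estimate is
# Stirling's formula for 1/n!.
n = 20
estimate = hayman_estimate(lambda r: r, lambda r: r, lambda r: r, n, 1e-9, 1e3)
print(estimate * math.factorial(n))  # ratio of the estimate to the exact coefficient 1/n!
```

For $G(z) = e^{z}$ the printed ratio is the classical Stirling ratio $n!\,e^{n}/(n^{n}\sqrt{2\pi n})$, which exceeds 1 by roughly $1/(12n)$.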


The usual comments regarding the choice of the function $\theta_0(r)$ apply. Considering
the expansion (45), we must have $\alpha_2(r)\theta_0^2 \to \infty$ and $\alpha_3(r)\theta_0^3 \to 0$. Thus, in order
to succeed, the method necessitates a priori $\alpha_3(r)^2/\alpha_2(r)^3 \to 0$. Then, $\theta_0$ should be
taken according to the saddle point heuristic,

(54)    $$\frac{1}{\alpha_2^{1/2}} \ll \theta_0 \ll \frac{1}{\alpha_3^{1/3}},$$

a possible choice being the geometric mean of the two bounds, $\theta_0(r) = \alpha_2^{-1/4}\alpha_3^{-1/6}$.


The original proof by Hayman [263] contains in fact a general result that describes
the shape of the individual terms $g_n r^n$ in the Taylor expansion of $G(z)$ as $r$ gets
closer to its limit value $\rho$: it appears that the terms $g_n r^n$ exhibit a bell-shaped profile.
Precisely, define a family of discrete random variables $X(r)$ indexed by $r \in (0, R)$ as
follows:

$$\mathbb{P}(X(r) = n) = \frac{g_n r^n}{G(r)}.$$

The model in which a random F structure with GF $G(z)$ is drawn with its size being
the random value $X(r)$ is known as a Boltzmann model. Then:

Proposition VIII.5. The Boltzmann probabilities associated to an admissible function
$G(z)$ satisfy, as $r \to \rho^{-}$, the Gaussian estimate,

(55)    $$\frac{g_n r^n}{G(r)} = \frac{1}{\sqrt{2\pi b(r)}} \left[\exp\left(-\frac{(a(r) - n)^2}{2b(r)}\right) + \epsilon_n\right],$$

where the error term satisfies $\epsilon_n = o(1)$ as $r \to \rho$ uniformly with respect to integers
$n$, i.e., $\lim_{r \to \rho} \sup_n |\epsilon_n| = 0$.
The proof is entirely similar to that of Theorem VIII.4; see Note 9.
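
The Gaussian profile of Boltzmann probabilities can be observed directly. The sketch below is an illustration for involutions, $G(z) = e^{z+z^2/2}$ at $r = 8$; it assumes the recurrence $I_n = I_{n-1} + (n-1)I_{n-2}$ for the counts, and uses the general fact that the mean and variance of $X(r)$ equal $a(r)$ and $b(r)$ exactly. The function name and the truncation at $n = 400$ are choices made here.

```python
import math

def boltzmann_involutions(r, n_max):
    """P(X(r) = n) = g_n r^n / G(r) for G(z) = exp(z + z^2/2), n = 0..n_max."""
    counts = [1, 1]
    for n in range(2, n_max + 1):
        counts.append(counts[n - 1] + (n - 1) * counts[n - 2])
    log_G = r + 0.5 * r * r
    return [math.exp(math.log(counts[n]) - math.lgamma(n + 1)
                     + n * math.log(r) - log_G)
            for n in range(n_max + 1)]

r = 8.0
probs = boltzmann_involutions(r, 400)
a_r, b_r = r + r * r, r + 2.0 * r * r   # a(r), b(r) from (46) with h(z) = z + z^2/2
mean = sum(n * p for n, p in enumerate(probs))
var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
gauss = [math.exp(-(a_r - n) ** 2 / (2.0 * b_r)) / math.sqrt(2.0 * math.pi * b_r)
         for n in range(len(probs))]
# mean and variance match a(r), b(r); the last value is the largest pointwise gap
print(mean, a_r, var, b_r, max(abs(p - q) for p, q in zip(probs, gauss)))
```

The largest pointwise gap is small compared with the peak height $1/\sqrt{2\pi b(r)}$, which is the bell-shaped behaviour described above.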


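▷ VIII.9. Admissibility and Boltzmann models. The Boltzmann distribution is accessible from

$$g_n r^n = \frac{1}{2\pi} \int_{-\theta_0}^{2\pi - \theta_0} G(re^{i\theta})\, e^{-in\theta}\, d\theta.$$

The estimation of this integral is once more based on a fundamental split

$$g_n r^n = J^{(0)} + J^{(1)} \qquad\text{where}\qquad J^{(0)} = \frac{1}{2\pi}\int_{-\theta_0}^{+\theta_0}, \quad J^{(1)} = \frac{1}{2\pi}\int_{\theta_0}^{2\pi - \theta_0},$$

and $\theta_0 = \theta_0(r)$ is as specified by the admissibility definition. Only the central approximation
and tails completion deserve adjustments. The “locality” condition H2 gives uniformly in $n$,

(56)    $$J^{(0)} = \frac{G(r)}{2\pi} \int_{-\theta_0}^{+\theta_0} e^{i(a(r)-n)\theta - \frac{1}{2}b(r)\theta^2}\,(1 + o(1))\, d\theta
        = \frac{G(r)}{2\pi} \left[\int_{-\theta_0}^{+\theta_0} e^{i(a(r)-n)\theta - \frac{1}{2}b(r)\theta^2}\, d\theta + o\left(\int_{-\infty}^{+\infty} e^{-\frac{1}{2}b(r)\theta^2}\, d\theta\right)\right],$$

and setting $(a(r) - n)(2/b(r))^{1/2} = c$, we obtain

(57)    $$J^{(0)} = \frac{G(r)}{\pi\sqrt{2b(r)}} \left[\int_{-\theta_0\sqrt{b(r)/2}}^{+\theta_0\sqrt{b(r)/2}} e^{-t^2 + ict}\, dt + o(1)\right].$$

The integral in (57) can then be routinely extended to a complete Gaussian integral, introducing
only $o(1)$ error terms,

(58)    $$g_n r^n = \frac{G(r)}{\pi\sqrt{2b(r)}} \left[\int_{-\infty}^{+\infty} e^{-t^2 + ict}\, dt + o(1)\right].$$

The Gaussian integral evaluates to $\sqrt{\pi}\, e^{-c^2/4}$, as is seen by completing the square and shifting
vertically the integration line. ◁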


▷ VIII.10. Non-admissible functions. Singularity analysis and H-admissibility conditions are
in a sense complementary. Indeed, the function $G(z) = (1-z)^{-1}$ fails to be admissible
as the asymptotic form that Theorem VIII.4 would imply is the erroneous

$$[z^n]\, \frac{1}{1-z} \sim \frac{e}{\sqrt{2\pi}},$$

corresponding to a saddle point near $1 - n^{-1}$. The explanation of the discrepancy is as follows:
Expansion (45) has $\alpha_\nu(r)$ of the order of $(1-r)^{-\nu}$, so that the locality condition and the decay
condition cannot be simultaneously satisfied.

Singularity analysis salvages the situation by using a larger part of the contour and by
normalizing to a global Hankel Gamma integral instead of a more “local” Gaussian integral.
This is also in accordance with the fact that the saddle point formula gives in the case of
$[z^n](1-z)^{-1}$ the multiple $e/\sqrt{2\pi} \approx 1.0844$ of the true value, namely, 1 (here $\zeta = n/(n+1)$,
$G(\zeta) = n+1$, $\zeta^{-n} \to e$, and $b(\zeta) = n(n+1)$). (More generally, functions of the form
$(1-z)^{-\beta}$ are typical instances with too slow a growth to be admissible.) ◁


Closure properties. An important characteristic of Hayman’s work is that it leads
to general theorems, which guarantee that large classes of functions are admissible.


Theorem VIII.5 (Closure of H-admissible functions). Let $G(z)$ and $H(z)$ be admissible
functions and let $P(z)$ be a polynomial with real coefficients. Then:

• (i) The product $G(z)H(z)$ and the exponential $e^{G(z)}$ are admissible functions.
• (ii) The sum $G(z) + P(z)$ is admissible. If the leading coefficient of $P(z)$ is
positive then $G(z)P(z)$ and $P(G(z))$ are admissible.
• (iii) If the Taylor coefficients of $e^{P(z)}$ are eventually positive, then $e^{P(z)}$ is
admissible.

PROOF. The easy proofs essentially reduce to making an inspired guess for the choice
of the $\theta_0$ function, which may be guided by Equation (54) in the usual way, and
then routinely checking the conditions of the admissibility definition. For instance,
in the case of the exponential, $K(z) = e^{G(z)}$, the conditions H1, H2, H3 of
Definition VIII.1 are satisfied if one takes $\theta_0(r) = (G(r))^{-2/5}$. We refer to Hayman’s
original paper [263] for details.


Exponentials of polynomials. The closure theorem also implies as a very special
case that any GF of the form $e^{P(z)}$ with $P(z)$ a polynomial with positive coefficients
can be subjected to saddle point analysis, a fact noted by Moser and Wyman [368,
369].

Corollary VIII.2 (Exponentials of polynomials). Let $P(z) = \sum_{j=1}^{m} a_j z^j$ have
nonnegative coefficients and be aperiodic in the sense that $\gcd\{j \mid a_j \ne 0\} = 1$. Let
$f(z) = e^{P(z)}$. Then, one has

$$f_n \equiv [z^n] f(z) \sim \frac{1}{\sqrt{2\pi\lambda}}\, \frac{e^{P(r)}}{r^{n}}, \qquad\text{where}\quad \lambda = \left(r\frac{d}{dr}\right)^{2} P(r),$$

and $r$ is a function of $n$ given implicitly by $r\frac{d}{dr}P(r) = n$.

The computations are in this case purely mechanical, since they only involve the
asymptotic expansion (with respect to $n$) of an algebraic equation.
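
Since the computation is mechanical, it can be sketched numerically. The helper below is illustrative only (its name and interface are assumptions): it takes the coefficient list of $P$, solves $r\frac{d}{dr}P(r) = n$ by bisection, which is legitimate because nonnegative coefficients make the left side increasing, and evaluates the main term of Corollary VIII.2. Aperiodicity is assumed, not checked.

```python
import math

def exp_poly_estimate(coeffs, n):
    """Main term of Corollary VIII.2 for f(z) = exp(P(z)), P(z) = sum_j coeffs[j] z^j.

    coeffs[0] must be 0 (the corollary has P(0) = 0); the other coefficients
    are assumed nonnegative, so r P'(r) is increasing and bisection applies.
    """
    def theta(r, power):
        # (r d/dr)^power applied to P, evaluated at r
        return sum(c * j ** power * r ** j for j, c in enumerate(coeffs) if j > 0)

    lo, hi = 0.0, 1.0
    while theta(hi, 1) < n:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if theta(mid, 1) < n:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    lam = theta(r, 2)
    log_value = theta(r, 0) - n * math.log(r) - 0.5 * math.log(2.0 * math.pi * lam)
    return math.exp(log_value)

# Involutions: P(z) = z + z^2/2, so n! * estimate approximates I_n.
print(exp_poly_estimate([0.0, 1.0, 0.5], 10) * math.factorial(10))
```

With $P(z) = z + z^2/2$ the printed value can be compared with the exact number of involutions of size 10, namely 9496.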

Granted the basic admissibility theorem and closure properties, many functions
are immediately seen to be admissible, including

$$e^{z}, \qquad e^{e^{z}-1}, \qquad e^{z+z^2/2},$$

which have previously served as lead examples for illustrating the saddle point method.
Corollary VIII.2 also covers involutions, permutations of a fixed order in the symmetric
group, permutations with cycles of bounded length, as well as set partitions with
bounded block sizes: see Note 11 below. More generally, Corollary VIII.2 applies
to any labelled set construction, $\mathcal{F} = \operatorname{SET}(\mathcal{G})$, when the sizes of $\mathcal{G}$-components are
restricted to a finite set, in which case one has

$$\mathcal{F}^{[m]} = \operatorname{SET}\left(\cup_{j=1}^{m} \mathcal{G}_j\right) \quad\Longrightarrow\quad F^{[m]}(z) = \exp\left(\sum_{j=1}^{m} G_j \frac{z^{j}}{j!}\right).$$

This covers all sorts of graphs (plain or functional) whose connected components are
of bounded size.


▷ VIII.11. Applications of “exponentials of polynomials”. Corollary VIII.2 applies to the
following combinatorial situations:

    Permutations of order $p$ ($\sigma^{p} = 1$)            $f(z) = \exp\left(\sum_{j \mid p} \frac{z^{j}}{j}\right)$
    Permutations with longest cycle $\le p$            $f(z) = \exp\left(\sum_{j=1}^{p} \frac{z^{j}}{j}\right)$
    Partitions of sets with largest block $\le p$      $f(z) = \exp\left(\sum_{j=1}^{p} \frac{z^{j}}{j!}\right)$

For instance, the number of solutions of $\sigma^{p} = 1$ in the symmetric group satisfies

$$n!\,[z^n] f(z) \sim \left(\frac{n}{e}\right)^{n(1 - 1/p)} p^{-1/2} \exp\left(n^{1/p}\right),$$

for any fixed prime $p \ge 3$ (Moser and Wyman [368, 369]). ◁


Complete asymptotic expansions. Harris and Schoenfeld have introduced in [261]
a technical condition of admissibility that is stronger than Hayman admissibility and
is called HS-admissibility. Under such HS-admissibility, a complete asymptotic
expansion can be obtained. We omit the definition here due to its technical character but
refer instead to the original paper [261] and to Odlyzko’s survey [377]. Odlyzko and
Richmond [378] later showed that, if $g(z)$ is H-admissible, then $f(z) = e^{g(z)}$ is
HS-admissible. Thus, taking H-admissibility to mean at least exponential growth, full
asymptotic expansions are to be systematically expected at double exponential growth
and beyond. The principles of developing full asymptotic expansions are essentially
the same as the ones explained on p. 522; only the discussion of the asymptotic scales
involved becomes a bit technical at this level of generality.


VIII.5.2. Higher level structures and admissibility. The concept of admissibility
and its surrounding properties (Theorems VIII.4 and VIII.5, Corollary VIII.2)
afford a neat discussion of which combinatorial classes should lead to counting
sequences that are amenable to the saddle point method. For simplicity, we restrict
ourselves here to the labelled universe.

Start from the first level structures, namely

$$\operatorname{SEQ}(\mathcal{Z}), \qquad \operatorname{CYC}(\mathcal{Z}), \qquad \operatorname{SET}(\mathcal{Z}),$$

corresponding respectively to permutations, circular graphs, and urns, with EGFs

$$\frac{1}{1-z}, \qquad \log\frac{1}{1-z}, \qquad e^{z}.$$

The first two are of singularity analysis class; the last one resorts, as we saw, to the
saddle point method and is H-admissible.

Next consider second level structures defined by arbitrary composition of two
constructions $\mathfrak{K} \circ \mathfrak{K}'$ applied to the atomic class $\mathcal{Z}$, where $\mathfrak{K}$ and $\mathfrak{K}'$ are taken amongst
the three constructions SEQ, CYC, SET; see Subsection II.4.2, p. 115 for a discussion.
(In the case of the internal construction $\mathfrak{K}'$ it is understood that, for definiteness, the
number of components is constrained to be $\ge 1$.) There are three structures whose
external construction is of the sequence type, namely,

$$\operatorname{SEQ} \circ \operatorname{SEQ}, \qquad \operatorname{SEQ} \circ \operatorname{CYC}, \qquad \operatorname{SEQ} \circ \operatorname{SET},$$

corresponding respectively to “labelled compositions”, alignments, and surjections.
All three have a dominant singularity that is a pole; hence they are amenable to
meromorphic coefficient asymptotics (Chapters IV and V), or, with resulting weaker
estimates, to singularity analysis.

Similarly there are three structures whose external construction is of the cycle
type, namely,

$$\operatorname{CYC} \circ \operatorname{SEQ}, \qquad \operatorname{CYC} \circ \operatorname{CYC}, \qquad \operatorname{CYC} \circ \operatorname{SET},$$

corresponding to cyclic versions of the previous ones. In that case, the EGFs have a
logarithmic singularity; hence they are amenable to singularity analysis (Chapters VI
and VII), or, after differentiation, to meromorphic coefficient asymptotics again.

The case of an external set construction is of interest. It gives rise to

$$\operatorname{SET} \circ \operatorname{SEQ}, \qquad \operatorname{SET} \circ \operatorname{CYC}, \qquad \operatorname{SET} \circ \operatorname{SET},$$

corresponding respectively to fragmented permutations, usual permutations, and set
partitions. The composition $\operatorname{SET} \circ \operatorname{CYC}$ appears to be special, because of the general
isomorphism, valid for any class $\mathcal{C}$,

$$\operatorname{SET}(\operatorname{CYC}(\mathcal{C})) \cong \operatorname{SEQ}(\mathcal{C}),$$

corresponding to the unicity of the decomposition of a permutation of $\mathcal{C}$-objects into
cycles. Accordingly, at generating function level, an exponential singularity “simplifies”,
when combined with a logarithmic singularity, giving rise to an algebraic (here
polar) singularity. The remaining two cases, namely, fragmented permutations and set
partitions, characteristically come under the saddle point method and admissibility, as
we have seen already.

Closure properties then make it possible to consider structures defined by an arbitrary
nesting of the constructions in {SEQ, CYC, SET}. For instance, “superpartitions”
defined by

$$\mathcal{S} = \operatorname{SET}(\operatorname{SET}_{\ge 1}(\operatorname{SET}_{\ge 1}(\mathcal{Z}))) \quad\Longrightarrow\quad S(z) = e^{e^{e^{z}-1}-1},$$

are third level structures. They can be subjected to admissibility theory and saddle
point estimates apply a priori. Notes 13 and 14 further discuss such third level
structures.


▷ VIII.12. Idempotent mappings. Consider functions from a finite set to itself (“mappings”
or “functional graphs” in the terminology of Chapter II) that are idempotent, i.e., $\phi \circ \phi = \phi$.
The EGF is $I(z) = \exp(ze^{z})$ since cycles are constrained to have length 1 exactly. The
function $I(z)$ is admissible and

$$I_n \sim \frac{n!}{\sqrt{2\pi n\zeta}}\, \zeta^{-n}\, e^{(n+1)/(\zeta+1)},$$

where $\zeta$ is the positive solution of $\zeta(\zeta+1)e^{\zeta} = n + 1$. This example is discussed by Harris
and Schoenfeld in [261]. ◁


▷ VIII.13. The number of societies. A society on $n$ distinguished individuals is defined by
Sloane and Wieder [441] as follows: first partition the $n$ individuals into nonempty subsets and
then form an ordered set partition [preferential arrangement] into each subset. The class of
societies is thus a “third level” (labelled) structure, with specification and EGF

$$\mathcal{S} = \operatorname{SET}(\operatorname{SEQ}_{\ge 1}(\operatorname{SET}_{\ge 1}(\mathcal{Z}))) \quad\Longrightarrow\quad S(z) = \exp\left(\frac{1}{2 - e^{z}} - 1\right).$$

The counting sequence starts as 1, 1, 4, 23, 173, 1602 (EIS 75729); asymptotically

$$\frac{S_n}{n!} \sim C\, \frac{e^{\sqrt{2n/\log 2}}}{n^{3/4}(\log 2)^{n}},$$

for some computable $C$. (The singularity is of the type “exponential-of-pole” at $\log 2$.) ◁

▷ VIII.14. Third level classes. Consider labelled classes defined from atoms ($\mathcal{Z}$) by three
nested constructions of the form $\mathfrak{K} \circ \mathfrak{K}'_{\ge 1} \circ \mathfrak{K}''_{\ge 1}$, where each $\mathfrak{K}, \mathfrak{K}', \mathfrak{K}''$ is either a sequence
(abbreviated $\mathfrak{S}$) or a set ($\mathfrak{P}$) construction. All cases can be analysed, either by saddle point
and admissibility (SP) or by singularity analysis (SA). Here is a table recapitulating structures,
together with their EGF, radius of convergence ($\rho$), and analytic type.

    PPP   $e^{e^{e^{z}-1}-1}$                         $\rho = \infty$        (SP)
    PPS   $e^{e^{z/(1-z)}-1}$                         $\rho = 1$             (SP)
    PSP   $\exp\left(\frac{e^{z}-1}{2-e^{z}}\right)$  $\rho = \log 2$        (SP)
    PSS   $e^{z/(1-2z)}$                              $\rho = \frac{1}{2}$   (SP)

    SPP   $\frac{1}{2 - e^{e^{z}-1}}$                 $\rho = \log(1 + \log 2)$             (SA)
    SPS   $\frac{1}{2 - e^{z/(1-z)}}$                 $\rho = \frac{\log 2}{1 + \log 2}$    (SA)
    SSP   $\frac{2 - e^{z}}{3 - 2e^{z}}$              $\rho = \log\frac{3}{2}$              (SA)
    SSS   $\frac{1 - 2z}{1 - 3z}$                     $\rho = \frac{1}{3}$                  (SA)

The outermost construction dictates the analytic type and precise asymptotic equivalents can be
developed in all cases. ◁
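
The initial values of the society numbers quoted in Note VIII.13 can be recomputed from the specification. The sketch below does not expand the EGF; instead it assumes the standard recurrence for labelled set constructions (condition on the size $k$ of the component containing the largest label), applied twice: once for preferential arrangements and once for societies. Function names are illustrative.

```python
from math import comb

def fubini_numbers(n_max):
    """a_k = number of ordered set partitions (preferential arrangements) of a k-set."""
    a = [1]
    for n in range(1, n_max + 1):
        # the first block of the arrangement has size k
        a.append(sum(comb(n, k) * a[n - k] for k in range(1, n + 1)))
    return a

def society_numbers(n_max):
    """S_n for a set of components, where a component of size k can be built in a_k ways."""
    a = fubini_numbers(n_max)
    s = [1]
    for n in range(1, n_max + 1):
        # the component containing the largest label has size k
        s.append(sum(comb(n - 1, k - 1) * a[k] * s[n - k] for k in range(1, n + 1)))
    return s

print(society_numbers(5))  # [1, 1, 4, 23, 173, 1602]
```

The same two-step pattern applies to any class of the form SET of components with a known counting sequence.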


VIII.5.3. Moment analyses. Univariate applications of admissibility include
the analysis of generating functions relative to moments of distributions, which are
obtained by differentiation and specialization of corresponding multivariate generating
functions. In the context of saddle point analyses, the dominant asymptotic form
of the mean value as well as bounds on the variance usually result, often leading to
concentration of distribution (convergence in probability) properties.

The situation of interest here is that of a counting generating function $G(z)$,
corresponding to a class $\mathcal{G}$, which is amenable to the saddle point method. A parameter $\chi$
on $\mathcal{G}$ gives rise to a bivariate GF $G(z, u)$, which is a deformation of $G(z)$ when $u$
is close to 1. Then the GFs

$$\partial_u G(z, u)\big|_{u=1}, \qquad \partial_u^2 G(z, u)\big|_{u=1}, \qquad \ldots,$$

relative to successive (factorial) moments, are, in many cases, amenable to an analysis
that closely resembles that of $G(z)$ itself. In this way, moments can be estimated
asymptotically.

We illustrate the analysis of moments by means of two examples:

— Example VIII.7 provides an analysis of the mean number of blocks in a random
set partition by means of bivariate generating functions.

— Example VIII.8 estimates the mean number of increasing subsequences in a
random permutation by means of a direct generating function construction.

EXAMPLE VIII.7. Blocks in random set partitions. The function

$$G(z, u) = e^{u(e^{z}-1)}$$

is the bivariate generating function of set partitions with $u$ marking the number of blocks (or
parts). We set $G(z) = G(z, 1)$ and define

$$M(z) = \left.\frac{\partial}{\partial u} G(z, u)\right|_{u=1} + G(z) = e^{e^{z}+z-1}.$$

Thus, the quantity

$$\frac{m_n}{g_n} = \frac{[z^n] M(z)}{[z^n] G(z)}$$

represents the mean number of parts in a random partition of $[1..n]$, augmented by 1. (The
derivative alone equals $(e^{z}-1)e^{e^{z}-1}$; the added term $G(z)$ shifts the mean by exactly 1, which is
immaterial for first-order asymptotics.) We already know that $G(z)$ is admissible and so is $M(z)$
by closure properties. The saddle point for the coefficient integral of $G(z)$ occurs at $\zeta$ such that
$\zeta e^{\zeta} = n$, and it is already known that $\zeta = \log n - \log\log n + o(1)$.

It would be possible to analyse $M(z)$ by means of Theorem VIII.4 directly: the analysis
then involves a saddle point $\zeta_1 \ne \zeta$ that is relative to $M(z)$. An analysis of the mean would
then follow, albeit at some computational effort. It is however more transparent to appeal to
Proposition VIII.5 and analyse the coefficients of $M(z)$ at the saddle point of $G(z)$.

Let $a(r), b(r)$ and $a_1(r), b_1(r)$ be the functions of Eq. (46) relative to $G(z)$ and $M(z)$
respectively:

    $\log G(z) = e^{z} - 1$          $\log M(z) = e^{z} + z - 1$
    $a(r) = re^{r}$                  $a_1(r) = re^{r} + r = a(r) + r$
    $b(r) = (r^2 + r)e^{r}$          $b_1(r) = (r^2 + r)e^{r} + r = b(r) + r$.


Thus, estimating $m_n$ by Proposition VIII.5 with the formula taken at $r = \zeta$, one finds

$$m_n = \frac{e^{\zeta} G(\zeta)}{\zeta^{n}\sqrt{2\pi b_1(\zeta)}} \left[\exp\left(-\frac{\zeta^2}{2b_1(\zeta)}\right) + o(1)\right],$$

while the corresponding estimate for $g_n$ is

$$g_n = \frac{G(\zeta)}{\zeta^{n}\sqrt{2\pi b(\zeta)}}\, [1 + o(1)].$$

Given that $b_1(\zeta) \sim b(\zeta)$ and that $\zeta^2$ is of smaller order than $b_1(\zeta)$, one has

$$\frac{m_n}{g_n} = e^{\zeta}(1 + o(1)) = \frac{n}{\log n}(1 + o(1)).$$

A similar computation applies to the second moment of the number of parts which is
found to be asymptotic to $e^{2\zeta}$ (the computation involves taking a second derivative). Thus, the
standard deviation of the number of parts is of an order $o(e^{\zeta})$ that is smaller than the mean.
This implies a concentration property for the distribution of the number of parts.


Proposition VIII.6. The variable $X_n$ equal to the number of parts in a random partition of the
set $[1..n]$ has expectation

$$\mathbb{E}\{X_n\} = \frac{n}{\log n}(1 + o(1)).$$

The distribution satisfies a “concentration” property: for any $\epsilon > 0$, one has

$$\mathbb{P}\left\{\left|\frac{X_n}{\mathbb{E}\{X_n\}} - 1\right| > \epsilon\right\} \to 0 \qquad\text{as } n \to +\infty.$$

The calculations are not difficult (see Note 15 for details) but they require care in the
manipulation of asymptotic expansions: for instance, Salvy and Shackell [426] who “do it
right” report that two discrepant estimates (differing by a factor of $e^{-1}$) had been previously
published regarding the value of the mean. .................. END OF EXAMPLE VIII.7.
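
The slow convergence hidden in the $(1 + o(1))$ factor is visible numerically. The sketch below assumes the classical identity that the total number of blocks over all partitions of an $n$-set is $S_{n+1} - S_n$, so that the exact mean is $S_{n+1}/S_n - 1$; it compares this with $e^{\zeta}$, where $\zeta e^{\zeta} = n$, and with $n/\log n$. Function names and the use of Newton's method are choices made here.

```python
import math

def bell_numbers(n_max):
    """Exact Bell numbers S_0..S_n_max computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

def mean_blocks(n, bells):
    """Exact mean number of blocks, using: total number of blocks = S_{n+1} - S_n."""
    return bells[n + 1] / bells[n] - 1.0

def saddle_point(n):
    """Newton iteration for zeta * exp(zeta) = n (intended for n >= 3)."""
    zeta = math.log(n)
    for _ in range(60):
        f = zeta * math.exp(zeta) - n
        zeta -= f / (math.exp(zeta) * (1.0 + zeta))
    return zeta

bells = bell_numbers(1001)
for n in (10, 100, 1000):
    zeta = saddle_point(n)
    print(n, mean_blocks(n, bells), math.exp(zeta), n / math.log(n))
```

The quantity $e^{\zeta}$ tracks the exact mean far more closely than $n/\log n$ does at these sizes, which illustrates why the text recommends working in terms of the saddle point.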


▷ VIII.15. Higher moments of the number of blocks in set partitions. Let $X_n$ be the number
of blocks in a random partition of $n$ elements. Then, one has

$$\mathbb{E}(X_n) = \frac{n}{\log n} + \frac{n\log\log n\,(1 + o(1))}{\log^2 n}, \qquad \mathbb{V}(X_n) = \frac{n}{\log^2 n} + \frac{n\,(2\log\log n - 1 + o(1))}{\log^3 n},$$

which proves concentration. The calculation is best performed in terms of the saddle point $\zeta$,
then converted in terms of $n$. [See Salvy’s étude [425] and [426].] ◁

▷ VIII.16. The shape of random involutions. Consider a random involution of size $n$, the EGF
of involutions being $e^{z+z^2/2}$. Then the mean number of 1-cycles and 2-cycles satisfy

$$\mathbb{E}(\#\,1\text{-cycles}) = \sqrt{n} + O(1), \qquad \mathbb{E}(\#\,2\text{-cycles}) = \frac{1}{2}n - \frac{1}{2}\sqrt{n} + O(1).$$

In addition, the corresponding distributions are concentrated. ◁
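
The mean values for involutions can be checked exactly. The sketch below assumes two elementary facts not spelled out in the note: the recurrence $I_n = I_{n-1} + (n-1)I_{n-2}$, and the identity $\mathbb{E}(\#\,1\text{-cycles}) = n I_{n-1}/I_n$ (choose the fixed point, the rest forms an involution). Function names are illustrative.

```python
import math

def involution_counts(n_max):
    """Exact I_0..I_n_max via the recurrence I_n = I_{n-1} + (n-1) I_{n-2}."""
    counts = [1, 1]
    for n in range(2, n_max + 1):
        counts.append(counts[n - 1] + (n - 1) * counts[n - 2])
    return counts[: n_max + 1]

def mean_fixed_points(n, counts):
    """E(# 1-cycles) = n * I_{n-1} / I_n."""
    return n * counts[n - 1] / counts[n]

def mean_two_cycles(n, counts):
    """E(# 2-cycles), from n = (# 1-cycles) + 2 (# 2-cycles)."""
    return 0.5 * (n - mean_fixed_points(n, counts))

counts = involution_counts(400)
for n in (25, 100, 400):
    print(n, mean_fixed_points(n, counts), math.sqrt(n),
          mean_two_cycles(n, counts), 0.5 * (n - math.sqrt(n)))
```

In each row the exact means stay within a bounded distance of $\sqrt{n}$ and $\frac{1}{2}(n - \sqrt{n})$ respectively, as the $O(1)$ terms predict.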


EXAMPLE VIII.8. Increasing subsequences in permutations. Given a permutation written in
linear notation as $\sigma = \sigma_1 \cdots \sigma_n$, an increasing subsequence is a subsequence $\sigma_{i_1} \cdots \sigma_{i_k}$ which
is in increasing order, i.e., $i_1 < \cdots < i_k$ and $\sigma_{i_1} < \cdots < \sigma_{i_k}$. The question asked is: What is the
mean number of increasing subsequences in a random permutation?

The problem has a flavour analogous to that of “hidden” patterns in random words, which 
was tackled in Chapter V, p. 292, and indeed similar methods are applicable here. Define a 
tagged permutation as a permutation together with one of its increasing subsequence distin- 
guished. (We also consider the null subsequence as an increasing subsequence.) For instance, 


7|352|641|89 


is a tagged permutation with the increasing subsequence 3 6 8 that is distinguished. The vertical 
bars are used to identify the tagged elements, but they may also be interpreted as decomposing 
the permutation into subpermutation fragments. We let $\mathcal{T}$ be the class of tagged permutations,
$T(z)$ be the corresponding EGF, and set $T_n = n!\,[z^n]T(z)$. The mean number of increasing
subsequences in a random permutation of size $n$ is clearly $t_n = T_n/n!$.

In order to enumerate $\mathcal{T}$, we let $\mathcal{P}$ be the class of all permutations and $\mathcal{P}^+$ the subclass of
nonempty permutations. Then, one has up to isomorphism,

$$\mathcal{T} = \mathcal{P} \times \mathrm{SET}(\mathcal{P}^+),$$

since a tagged permutation can be reconstructed from its initial fragment and the set of its
fragments (by ordering the set according to increasing values of initial elements). This
combinatorial argument gives the EGF $T(z)$ as

$$T(z) = \frac{1}{1-z}\exp\left(\frac{z}{1-z}\right).$$

The generating function $T(z)$ can be expanded, so that the quantity $T_n$ admits a closed
form,

$$T_n = \sum_{k=0}^{n} \binom{n}{k}\frac{n!}{k!}.$$
From there it is possible to analyse $T_n$ asymptotically by means of the Laplace method for
sums, as was done by Lifschitz and Pittel in [332]. However, analytically, the function $T(z)$ is
a mere variant of the EGF of fragmented permutations. Saddle point conditions are again easily
checked, either directly or via admissibility, to the effect that

$$(59)\qquad t_n \equiv \frac{T_n}{n!} \sim \frac{e^{-1/2}\,e^{2\sqrt{n}}}{2\sqrt{\pi}\,n^{1/4}}.$$

(Compare with the closely related estimate (44) on p. 527.)

The estimate (59) has the great advantage of providing information about an important and
much less accessible parameter. Indeed, let $\lambda(\sigma)$ represent the length of the longest increasing
subsequence in $\sigma$. With $\iota(\sigma)$ the number of increasing subsequences, one has the general
inequality,

$$2^{\lambda(\sigma)} \le \iota(\sigma),$$

since the number of increasing subsequences of $\sigma$ is at least as large as the number of
subsequences contained in the longest increasing subsequence. Let now $\ell_n$ be the expectation of $\lambda$
over permutations of size $n$. Then, convexity of the function $2^x$ implies

$$(60)\qquad 2^{\ell_n} \le t_n, \quad\text{so that}\quad \ell_n \le \frac{2}{\log 2}\sqrt{n}\,(1+o(1)).$$


In summary: 


Proposition VIII.7. The mean number of increasing subsequences in a random permutation
of $n$ elements is asymptotically

$$\frac{e^{2\sqrt{n}-1/2}}{2\sqrt{\pi}\,n^{1/4}}.$$

Accordingly, the expected length of the longest increasing subsequence in a random permutation
of size $n$ satisfies the inequality

$$\ell_n \le \frac{2}{\log 2}\sqrt{n}\,(1+o(1)).$$
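
The asymptotic form in the proposition can be checked numerically. The sketch below assumes the closed form $T_n = \sum_k \binom{n}{k}\, n!/k!$ obtained by expanding $T(z) = (1-z)^{-1}\exp(z/(1-z))$; the function names are illustrative only. The printed ratios should drift towards 1, slowly, since the relative error is expected to be of order $n^{-1/2}$.

```python
from math import comb, exp, factorial, pi, sqrt


def mean_increasing_subsequences(n: int) -> float:
    """Exact t_n = T_n / n!, using T_n = sum_k C(n, k) * n! / k! (integer arithmetic)."""
    n_fact = factorial(n)
    tagged = sum(comb(n, k) * (n_fact // factorial(k)) for k in range(n + 1))
    return tagged / n_fact  # true division of big integers


def saddle_point_estimate(n: int) -> float:
    """Asymptotic form e^{2 sqrt(n) - 1/2} / (2 sqrt(pi) n^{1/4})."""
    return exp(2.0 * sqrt(n) - 0.5) / (2.0 * sqrt(pi) * n ** 0.25)


for n in (10, 100, 1000):
    ratio = mean_increasing_subsequences(n) / saddle_point_estimate(n)
    print(n, ratio)
```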


The upper bound obtained on the expected length $\ell_n$ of the longest increasing subsequence
is of the form $2.89\sqrt{n}$ while Note 19 describes a lower bound of the form $\ell_n \ge \frac{1}{2}\sqrt{n}$. In
fact, Logan and Shepp [336] independently of Vershik and Kerov [485] have succeeded in
establishing the much more difficult result

$$\ell_n \sim 2\sqrt{n}.$$

Their proof is based on a detailed analysis of the profile of a random Young tableau. (The bound
obtained here by a simple mixture of saddle point estimates and combinatorial approximations
at least provides the right order of magnitude.) This has led in turn to attempts at characterizing
the asymptotic distribution of the length of the longest increasing subsequence. The problem
remained unsolved for two decades, despite much tangible progress. J. Baik, P. A. Deift,
and K. Johansson [19] eventually obtained a solution (in a publication dated 1999) by relating
longest increasing subsequences to eigenvalues of random matrix ensembles. We regretfully
redirect the reader to relevant presentations of the beautiful theory surrounding this sensational
result, for instance [7, 114]. .................. END OF EXAMPLE VIII.8.
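
As an empirical illustration of the orders of magnitude discussed in this example, the following sketch simulates the longest increasing subsequence of random permutations. It uses patience sorting (a standard technique, not the method of the text), and the sample sizes and seed are arbitrary choices. The empirical mean can be compared with the lower bound $\frac{1}{2}\sqrt{n}$, the value $2\sqrt{n}$, and the upper bound $2.89\sqrt{n}$.

```python
import random
from bisect import bisect_left
from math import sqrt


def longest_increasing_subsequence_length(seq) -> int:
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest possible last element of an increasing run of length k+1
    for x in seq:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(tails)


def mean_lis_length(n: int, trials: int, seed: int = 0) -> float:
    """Empirical mean of the longest increasing subsequence over random permutations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += longest_increasing_subsequence_length(perm)
    return total / trials


n = 400
print(mean_lis_length(n, trials=50), 0.5 * sqrt(n), 2 * sqrt(n), 2.89 * sqrt(n))
```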


▷ VIII.17. A useful recurrence. A decomposition according to the location of $n$ yields for $t_n$
in (60) the recurrence

$$t_n = t_{n-1} + \frac{1}{n}\sum_{k=0}^{n-1} t_k, \qquad t_0 = 1.$$

Hence $T(z)$ satisfies the ordinary differential equation,

$$(1-z)^2\,\frac{d}{dz}T(z) = (2-z)\,T(z), \qquad T(0) = 1,$$

which can be solved explicitly. Also the differential equation gives rise to the recurrence

$$t_{n+1} = 2t_n - \frac{n}{n+1}\,t_{n-1}, \qquad t_0 = 1,\ t_1 = 2,$$

by which $t_n$ can be computed efficiently in a linear number of operations. ◁
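
A minimal sketch of the two recurrences of this note, in exact rational arithmetic. It takes $t_0 = 1$, $t_1 = 2$ as initial values (consistent with $T(0) = 1$); the function names are illustrative. Both recurrences should reproduce $T_n = n!\,t_n = 1, 2, 7, 34, 209, 1546, \ldots$

```python
from fractions import Fraction
from math import factorial


def t_by_linear_recurrence(n_max: int):
    """t_{n+1} = 2 t_n - n/(n+1) t_{n-1}, with t_0 = 1, t_1 = 2."""
    t = [Fraction(1), Fraction(2)]
    for n in range(1, n_max):
        t.append(2 * t[n] - Fraction(n, n + 1) * t[n - 1])
    return t[: n_max + 1]


def t_by_summation_recurrence(n_max: int):
    """t_n = t_{n-1} + (1/n) * sum_{k<n} t_k, with t_0 = 1."""
    t = [Fraction(1)]
    for n in range(1, n_max + 1):
        t.append(t[n - 1] + sum(t[:n]) / n)
    return t


values = t_by_linear_recurrence(5)
print([int(t * factorial(n)) for n, t in enumerate(values)])
```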


▷ VIII.18. Related combinatorics. The sequence of values of $T_n$ starts as 1, 2, 7, 34, 209, 1546,
and is EIS A002720. It counts the following equivalent objects: (i) the $n \times n$ binary matrices
with at most one entry 1 in each row and each column; (ii) the partial matchings of the complete
bipartite graph $K_{n,n}$; (iii) the injective partial mappings of $[1\,..\,n]$ to itself. ◁

▷ VIII.19. A simple probabilistic lower bound. Elementary probability theory provides a simple
lower bound on $\ell_n$. Let $X_1, \ldots, X_n$ be independent random variables uniformly distributed
over $[0, 1]$. Assume $n = m^2$. Partition $[0, 1[$ into $m$ subintervals each of the form
$[\frac{j-1}{m}, \frac{j}{m}[$ and $X_1, \ldots, X_n$ into $m$ blocks, each of the form
$X_{(k-1)m+1}, \ldots, X_{km}$. There is a probability
$1 - (1 - m^{-1})^m \sim 1 - e^{-1}$ that block numbered 1 contains an element of subinterval numbered
1, block numbered 2 contains an element of subinterval numbered 2, and so on. Then, with high
probability, at least $\frac{1}{2}m$ of the blocks contain an element in their matching subinterval.
Consequently, $\ell_n \ge \frac{1}{2}\sqrt{n}$, for $n$ large enough. (The factor $\frac{1}{2}$ can even be improved a little.) The
crisp booklet by Steele [451] describes many similar as well as more advanced applications to
combinatorial optimization. See also the book of Motwani and Raghavan [370] for applications
to randomized algorithms in computer science. ◁


▷ VIII.20. The Baik-Deift-Johansson Theorem. Consider the Painlevé II equation
$u''(x) = 2u(x)^3 + x\,u(x)$, and the particular solution $u_0(x)$ that is asymptotic to $-\mathrm{Ai}(x)$ as $x \to +\infty$,
with $\mathrm{Ai}(x)$ the Airy function, which solves $y'' - xy = 0$. Define the Tracy-Widom distribution
(arising in random matrix theory)

$$F(t) = \exp\left(-\int_t^{\infty} (x - t)\,u_0(x)^2\,dx\right).$$

The distribution of the length of the longest increasing subsequence, $L_n$, satisfies

$$\lim_{n\to\infty} \mathbb{P}\left(L_n \le 2\sqrt{n} + t\,n^{1/6}\right) = F(t),$$

for any fixed $t$. Thus the discrete random variable $L_n$ converges to a well-characterized
distribution [19]. (An exact formula for associated GFs is due to Gessel; see p. 698.) ◁


VIII. 6. Integer partitions 


We examine the asymptotic enumeration of partitions, where the saddle point 
method serves as the main asymptotic engine. The corresponding generating function 
enjoys rich properties, and the analysis, which goes back to Hardy and Ramanujan in 
1917, constitutes, as we pointed out in the introduction, a jewel of classical analysis. 

Integer partitions represent additive decompositions of integers, when the order 
of summands is not taken into account. When all summands are allowed, the specifi- 
cation and ordinary generating function are (Section I. 3, p. 37) 


$$(61)\qquad \mathcal{P} = \mathrm{MSET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})) \quad\Longrightarrow\quad P(z) = \prod_{m=1}^{\infty} \frac{1}{1 - z^m},$$

which, by the exp-log transformation, admits the equivalent form

$$(62)\qquad P(z) = \exp\sum_{m=1}^{\infty} \log(1 - z^m)^{-1}
= \exp\left(\frac{z}{1-z} + \frac{1}{2}\,\frac{z^2}{1-z^2} + \frac{1}{3}\,\frac{z^3}{1-z^3} + \cdots\right).$$

From either of these two forms, it can be seen that the unit circle is a natural boundary,
beyond which the function cannot be continued. The second form, which involves the
quantity $\exp(z/(1-z))$, is reminiscent of the EGF of fragmented permutations, examined
in Example 6, p. 527, to which the saddle point method could be successfully
applied.

Summands                      Specification            Asymptotics                                                  Reference
All, Z_{>=1}                  MSET(SEQ_{>=1}(Z))       $\frac{1}{4n\sqrt{3}}\,e^{\pi\sqrt{2n/3}}$                   Ex. 9, p. 541
All distinct, Z_{>=1}         PSET(SEQ_{>=1}(Z))       $\frac{1}{4\cdot 3^{1/4} n^{3/4}}\,e^{\pi\sqrt{n/3}}$        Note 25, p. 546
Squares, 1, 4, 9, 16, ...     -                        $C\,n^{-7/6}\,e^{K n^{1/3}}$                                 Note 25, p. 546
Primes, 2, 3, 5, 7, ...       -                        $\log P_n^{(\Pi)} \sim c\sqrt{n/\log n}$                     Note 27, p. 546
Powers of two, 1, 2, 4, ...   -                        $\log M_{2n} \sim \frac{(\log n)^2}{2\log 2}$                Note 28, p. 547
Plane                         -                        $c_1\,n^{-25/36}\,e^{c_2 n^{2/3}}$                           Note 26, p. 546

FIGURE VIII.7. Asymptotic enumeration of various types of partitions.


In what follows, we show (Example 9) that the saddle point method is applicable,
though the analysis of P(z) near the unit circle is delicate (and pregnant with deep 
properties). The accompanying notes point to similar methods being applicable to a 
variety of similar-looking generating functions, including those relative to partitions 
into primes, squares, and distinct summands, as well as plane partitions: see Figure 7 
for a summary of some of the asymptotic results known. 


EXAMPLE VIII.9. Integer partitions. We are dealing here with a famous chapter of both as- 
ymptotic combinatorics and additive number theory. A problem similar to that of asymptotically 
enumerating partitions was first raised by Ramanujan in a letter to Hardy in 1913, and subse- 
quently developed in a famous joint work of Hardy and Ramanujan (see the account in Hardy’s 
Lectures [260]). The Hardy-Ramanujan expansion was later perfected by Rademacher [18] 
who, in a sense, gave an “exact” formula for the partition numbers P,,. 

A complete derivation with all details would consume more space than we can devote
to this question. We outline here the proof strategy in such a way that, hopefully, the reader
can supply the missing details by herself. (The cited references provide a complete treatment.)

Like before, we start with simple saddle point bounds. Let $P_n$ denote the number of
integer partitions of $n$, with OGF as stated in (61). A form amenable to bounds derives from
the exp-log reorganization (62), which yields

$$P(z) = \exp\left(\frac{1}{1-z}\left(\frac{z}{1} + \frac{z^2}{2(1+z)} + \frac{z^3}{3(1+z+z^2)} + \cdots\right)\right).$$

The denominator of the general term in the exponential satisfies, for $x \in (0,1)$,
$m x^{m-1} < 1 + x + \cdots + x^{m-1} < m$, so that

$$(63)\qquad \frac{1}{1-x}\sum_{m\ge 1}\frac{x^m}{m^2} < \log P(x) < \frac{1}{1-x}\sum_{m\ge 1}\frac{x}{m^2}.$$

This proves, for real $x \to 1^-$, that

$$(64)\qquad P(x) = \exp\left(\frac{\pi^2}{6(1-x)}\,(1+o(1))\right),$$

given the elementary identity $\sum_{m\ge 1} m^{-2} = \pi^2/6$. The singularity type at $z = 1$ resembles that
of fragmented permutations (p. 527), and at least the growth along the real axis is similar. An
approximate saddle point is then

$$(65)\qquad \zeta(n) = 1 - \frac{\pi}{\sqrt{6n}},$$

which gives a saddle point bound

$$P_n \le \exp\left(K\sqrt{n}\,(1+o(1))\right), \qquad K = \pi\sqrt{2/3}.$$


Proceeding further involves transforming the saddle point bounds into a complete saddle
point analysis. Based on previous experience, we shall integrate along a circle of radius
$r = \zeta(n)$. To do so, two ingredients are needed: (i) an approximation in the central range;
(ii) bounds establishing that the function $P(z)$ is small away from the central range so that
tails can be first neglected, then completed back. Assuming the expansion (63) to lift to an area
of the complex plane near the real axis, the range of the saddle point should be analogous to
what was found already for $\exp(z/(1-z))$, so that $\theta_0 = n^{-7/10}$ will be adopted. Accordingly,
we choose to integrate along a circle of radius $r = \zeta(n)$ given by (65) and define the central
region by $\theta_0 = n^{-7/10}$. Under these conditions, the central region is seen under an angle that
is $O(n^{-1/5})$ from the point $z = 1$.

(i) Central approximation. This requires a refinement of (63) till $o(1)$ terms as well as an
argument establishing a lifting to a region near the real axis. We set $z = e^{-t}$ and start with
$t > 0$. The function

$$L(t) := \log P(e^{-t}) = \sum_{m\ge 1} \frac{e^{-mt}}{m\,(1 - e^{-mt})}$$

is a harmonic sum which is amenable to Mellin transform techniques (as described in
APPENDIX B: Mellin transform, p. 707). The base function is $e^{-t}/(1 - e^{-t})$, the amplitudes are
the coefficients $1/m$ and the frequencies are the quantities $m$ figuring in the exponents. The
Mellin transform of the base function, as given in the appendix, is $\Gamma(s)\zeta(s)$. The Dirichlet
series associated to the amplitude-frequency pairs is $\sum_{m\ge1} m^{-1} m^{-s} = \zeta(s+1)$, so that

$$L^{\star}(s) = \zeta(s)\,\zeta(s+1)\,\Gamma(s).$$
Thus $L(t)$ is amenable to Mellin asymptotics and one finds

$$(66)\qquad L(t) = \frac{\pi^2}{6t} + \frac{1}{2}\log t - \log\sqrt{2\pi} - \frac{t}{24} + O(t^2), \qquad t \to 0^+,$$

from the poles of $L^{\star}(s)$ at $s = 1, 0, -1$. This corresponds to an improved form of (64):

$$(67)\qquad \log P(z) = \frac{\pi^2}{6(1-z)} + \frac{1}{2}\log(1-z) - \frac{\pi^2}{12} - \log\sqrt{2\pi} + O(1-z).$$

At this stage, we make a crucial observation: The precise estimate (66) extends when $t$ lies
in any sector symmetric about the real axis, situated in the half-plane $\Re(t) > 0$, and with
an opening angle of the form $\pi - \delta$ for an arbitrary $\delta > 0$. This derives from the fact that
the Mellin inversion integral and the companion residue calculations giving rise to (66) extend
to the complex realm as long as $|\mathrm{Arg}(t)| \le \frac{\pi}{2} - \frac{\delta}{2}$. (See the appendix on Mellin or the
article [184].) Thus, the expansion (67) holds throughout the central region given our choice
of the angle $\theta_0$. The analysis in the central region is then practically isomorphic to the one of
$\exp(z/(1-z))$ in the previous example, and it presents no special difficulty.
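
The Mellin expansion of $L(t) = \log P(e^{-t})$ can be checked numerically. The sketch below sums $-\log(1 - e^{-mt})$ directly and compares with $\frac{\pi^2}{6t} + \frac12\log t - \log\sqrt{2\pi} - \frac{t}{24}$; the truncation threshold and function names are assumptions made for illustration. For small $t$ the discrepancy is expected to be far below the stated $O(t^2)$ error term.

```python
from math import exp, log, log1p, pi, sqrt


def log_partition_gf(t: float, tol: float = 1e-18) -> float:
    """L(t) = log P(e^{-t}) = -sum_{m>=1} log(1 - e^{-m t}), truncated once terms drop below tol."""
    total, m = 0.0, 1
    while True:
        q = exp(-m * t)
        if q < tol:
            break
        total -= log1p(-q)
        m += 1
    return total


def mellin_expansion(t: float) -> float:
    """Terms arising from the poles of zeta(s) zeta(s+1) Gamma(s) at s = 1, 0, -1."""
    return pi ** 2 / (6 * t) + 0.5 * log(t) - log(sqrt(2 * pi)) - t / 24


for t in (0.5, 0.1, 0.05):
    print(t, log_partition_gf(t), mellin_expansion(t))
```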


FIGURE VIII.8. Integer partitions. (Left) The surface $|P(z)|$ with $P(z)$ the OGF of
integer partitions. The plot shows the major singularity at $z = 1$ and smaller peaks
corresponding to singularities at $z = -1$, $e^{\pm 2i\pi/3}$ and other roots of unity. (Right) A plot
of $P(re^{i\theta})$ for varying $\theta$ and $r = 0.5, \ldots, 0.75$ illustrates the increasing concentration
property of $P(z)$ near the real axis.


(ii) Bounds in the noncentral region. This is here a nontrivial task since half of the factors
entering the product form (61) of $P(z)$ are infinite at $z = -1$, one third are infinite at
$z = e^{\pm 2i\pi/3}$, and so on. Accordingly, the landscape of $|P(z)|$ along a circle of radius $r$ that tends
to 1 is quite chaotic: see Figure 8 for a rendering. It is possible to extend the analysis of
$\log P(z)$ near the real axis by way of the Mellin transform to the case $z = e^{-t-i\phi}$ as $t \to 0$
and $\phi = 2\pi\frac{p}{q}$ is commensurate to $2\pi$. In that case, one must operate with

$$L_{\phi}(t) = \sum_{m\ge 1} \frac{1}{m}\,\frac{e^{-m(t+i\phi)}}{1 - e^{-m(t+i\phi)}}
= \sum_{m\ge 1}\sum_{k\ge 1} \frac{1}{m}\,e^{-mk(t+i\phi)},$$

which is yet another harmonic sum. The net result is that when $z$ tends radially towards $e^{-i\phi}$,
then $P(z)$ behaves roughly like

$$(68)\qquad \exp\left(\frac{\pi^2}{6q^2(1-|z|)}\right),$$

which is a power $1/q^2$ of the exponential growth as $z \to 1^-$. This analysis extends next to
a small arc. Finally, consider a complete covering of the circle by arcs whose centres are of
argument $2\pi j/N$, $j = 1, \ldots, N-1$, with $N$ chosen large enough. A uniform version of the
bound (68) makes it possible to bound the contribution of the noncentral region and prove it
to be exponentially small. There are several technical details to be filled in order to justify this
approach, so that we switch to a more synthetic one based on transformation properties of $P(z)$,
following [10, 13, 18, 260]. (Such properties also enter the Hardy-Ramanujan-Rademacher
formula for $P_n$ in an essential way.)

The fundamental identity satisfied by $P(z)$ reads

$$(69)\qquad P(e^{-2\pi\tau}) = \sqrt{\tau}\,\exp\left(\frac{\pi}{12}\left(\frac{1}{\tau} - \tau\right)\right) P(e^{-2\pi/\tau}),$$

which is valid when $\Re(\tau) > 0$. The proof is a simple rephrasing of a transformation formula of
Dedekind’s $\eta$ (eta) function, summarized in Note 21 below.
▷ VIII.21. Modular transformation for the Dedekind eta function. Consider

$$\eta(\tau) := q^{1/24}\prod_{m=1}^{\infty}(1 - q^m), \qquad q = e^{2i\pi\tau},$$

with $\Im(\tau) > 0$. Then $\eta(\tau)$ satisfies the “modular transformation” formula,

$$(70)\qquad \eta\left(-\frac{1}{\tau}\right) = \sqrt{\frac{\tau}{i}}\;\eta(\tau).$$

This transformation property is first proved when $\tau$ is purely imaginary, $\tau = it$, then extended
by analytic continuation. Its logarithmic form results from a residue evaluation of the integral

$$\frac{1}{2i\pi}\int_{\gamma} \cot \pi s\;\cot \pi\frac{s}{\tau}\;\frac{ds}{s},$$

with $\gamma$ a large contour avoiding poles. (This elementary derivation is due to C. L. Siegel. The
function $\eta(\tau)$ satisfies transformation formulae under $S : \tau \mapsto \tau + 1$ and $T : \tau \mapsto -1/\tau$, which
generate the group of modular (in fact “unimodular”) transformations $\tau \mapsto (a\tau + b)/(c\tau + d)$
with $ad - bc = 1$. Such functions are called modular forms.) ◁
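
The identity (69), $P(e^{-2\pi\tau}) = \sqrt{\tau}\,\exp\!\big(\frac{\pi}{12}(\frac{1}{\tau} - \tau)\big)\,P(e^{-2\pi/\tau})$ for $\Re(\tau) > 0$, lends itself to a quick numerical sanity check. The sketch below truncates the infinite product at an arbitrary number of factors (an assumption that is adequate only when $|z|$ is not too close to 1) and uses the principal branch of the square root.

```python
import cmath
from math import pi


def partition_gf(z: complex, terms: int = 400) -> complex:
    """Truncated product P(z) = prod_{m=1}^{terms} 1 / (1 - z^m), for |z| < 1."""
    product = 1.0 + 0.0j
    power = 1.0 + 0.0j
    for _ in range(terms):
        power *= z
        product /= (1.0 - power)
    return product


def modular_identity_sides(tau: complex):
    """Return both sides of the transformation identity for Re(tau) > 0."""
    lhs = partition_gf(cmath.exp(-2 * pi * tau))
    rhs = (cmath.sqrt(tau)
           * cmath.exp(pi / 12 * (1 / tau - tau))
           * partition_gf(cmath.exp(-2 * pi / tau)))
    return lhs, rhs


for tau in (0.5, 2.0, 0.3 - 0.5j):
    print(tau, *modular_identity_sides(tau))
```

The last sample value, $\tau = 0.3 - 0.5i$, places $z = e^{-2\pi\tau}$ on the negative real axis, which is the situation exploited in the text for the behaviour near $z = -1$.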

Given (69), the behaviour of $P(z)$ away from the positive real axis and near the unit
circle can now be quantified. Here, we content ourselves with a representative special case, the
situation when $z \to -1$. Consider thus $P(z)$ with $z = e^{-2\pi t + i\pi}$, where, for our purposes, we
may take $t = 1/\sqrt{24n}$. Then, Equation (69) relates $P(z)$ to $P(z')$, with $\tau = t - i/2$ and

$$z' = e^{-2\pi/\tau} = \exp\left(-\frac{2\pi t}{t^2 + \frac{1}{4}}\right) e^{i\phi}, \qquad \phi = -\frac{\pi}{t^2 + \frac{1}{4}}.$$

Thus $|z'| \to 1$ as $t \to 0$ with the important condition that $1 - |z'| \sim 4\,(1 - |z|)$. In other
words, $z'$ has moved away from the unit circle. Thus, since $|P(z')| \le P(|z'|)$, we may apply
the estimate (67) to $P(|z'|)$ to the effect that

$$\log|P(z)| \le \frac{\pi^2}{24\,(1 - |z|)}\,(1 + o(1)), \qquad (z \to -1^+).$$

This is an instance of what was announced in (68) and is in agreement with the surface plot of
Figure 8. The extension to an arbitrary angle presents no major difficulty.


The two properties developed in (i) and (ii) above guarantee that the approximation (67)
can be used and that tails can be completed. We find accordingly that

$$P_n \sim [z^n]\;\frac{e^{-\pi^2/12}}{\sqrt{2\pi}}\,\sqrt{1-z}\;\exp\left(\frac{\pi^2}{6(1-z)}\right).$$

All computations done, this provides:


Proposition VIII.8. The number $p_n$ of partitions of integer $n$ satisfies

$$(71)\qquad p_n \equiv [z^n]\prod_{k=1}^{\infty}\frac{1}{1-z^k} \sim \frac{1}{4n\sqrt{3}}\,e^{\pi\sqrt{2n/3}}.$$


The singular behaviour along and near the real line is comparable to that of $\exp((1-z)^{-1})$,
which explains a growth of the form $e^{\sqrt{n}}$. .................. END OF EXAMPLE VIII.9.

The asymptotic formula (71) is only the first term of a complete expansion involving
decreasing exponentials that was discovered by Hardy and Ramanujan in 1917 and
later perfected by Rademacher (see Note 23 below). While the full Hardy-Ramanujan
expansion necessitates considering infinitely many saddle points near the unit circle
and requires the modular transformation of Note 21, the main term of (71) only
requires the asymptotic expansion of the partition generating function near $z = 1$.
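
The quality of the main term (71) can be gauged numerically. The following sketch computes exact partition numbers by a standard dynamic programme over the product form (61) and compares them with $\frac{1}{4n\sqrt{3}}e^{\pi\sqrt{2n/3}}$; the function names are illustrative. The ratio should approach 1 at a rate of roughly $n^{-1/2}$.

```python
from math import exp, pi, sqrt


def partition_numbers(n_max: int):
    """Coefficients p_0..p_{n_max} of prod_{m>=1} 1/(1 - z^m), one factor at a time."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p


def hardy_ramanujan_main_term(n: int) -> float:
    """Main asymptotic term exp(pi sqrt(2n/3)) / (4 n sqrt(3))."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * n * sqrt(3))


p = partition_numbers(1000)
for n in (10, 100, 1000):
    print(n, p[n], hardy_ramanujan_main_term(n) / p[n])
```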

The principles underlying the partition example have been made into a general
method by Meinardus [355] in 1954. Meinardus’ method abstracts the essential features
of the proof and singles out sufficient conditions under which the analysis of an
infinite product generating function can be achieved. The conditions, in agreement
with the Mellin treatment of harmonic sums, require analytic continuation of the
Dirichlet series involved in $\log P(z)$ (or its analogue), as well as smallness towards
infinity of that same Dirichlet series. A summary of Meinardus’ method constitutes
Chapter 6 of Andrews’ treatise on partitions [10] to which the reader is referred. The
method applies to many cases where the summands and their multiplicities have a
regular enough arithmetic structure.
▷ VIII.22. A simple yet powerful formula. Define (cf. [260, p. 118])

$$P_n^{\star} = \frac{1}{2\pi\sqrt{2}}\,\frac{d}{dn}\left(\frac{e^{K\lambda_n}}{\lambda_n}\right), \qquad
K = \pi\sqrt{\frac{2}{3}}, \quad \lambda_n := \sqrt{n - \frac{1}{24}}.$$

Then $P_n^{\star}$ approximates $P_n$ with a relative precision of order $e^{-c\sqrt{n}}$ for some $c > 0$. For
instance, the error is less than $3\cdot 10^{-8}$ for $n = 1000$. [Hint: The transformation formula makes
it possible to evaluate the central part of the integral very precisely.] ◁
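
A hedged numerical sketch of this note: with $\lambda_n = \sqrt{n - 1/24}$ and $d\lambda_n/dn = 1/(2\lambda_n)$, the derivative evaluates to $e^{K\lambda_n}(K\lambda_n - 1)/(2\lambda_n^3)$, which is what the code below uses. The exact values come from a plain dynamic programme; floating-point precision limits how small a relative error can be observed.

```python
from math import exp, pi, sqrt


def hardy_estimate(n: int) -> float:
    """(1 / (2 pi sqrt 2)) * d/dn [ e^{K lam} / lam ], lam = sqrt(n - 1/24), K = pi sqrt(2/3)."""
    K = pi * sqrt(2.0 / 3.0)
    lam = sqrt(n - 1.0 / 24.0)
    derivative = exp(K * lam) * (K * lam - 1.0) / (2.0 * lam ** 3)
    return derivative / (2.0 * pi * sqrt(2.0))


def partition_number(n: int) -> int:
    """Exact number of partitions of n (dynamic programme over part sizes)."""
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            p[total] += p[total - part]
    return p[n]


for n in (50, 100, 200):
    exact = partition_number(n)
    print(n, exact, hardy_estimate(n) / exact - 1.0)
```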


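▷ VIII.23. The Hardy-Ramanujan-Rademacher expansion. The number of integer partitions
satisfies the exact formula

$$P_n = \frac{1}{\pi\sqrt{2}}\sum_{k=1}^{\infty} A_k(n)\,\sqrt{k}\;\frac{d}{dn}
\left[\frac{\sinh\left(\frac{\pi}{k}\sqrt{\frac{2}{3}\left(n - \frac{1}{24}\right)}\right)}{\sqrt{n - \frac{1}{24}}}\right],
\qquad\text{where}\quad A_k(n) = \sum_{h \bmod k,\ \gcd(h,k)=1} \omega_{h,k}\,e^{-2i\pi hn/k},$$

$\omega_{h,k}$ is a 24th root of unity, $\omega_{h,k} = \exp(\pi i\,s(h,k))$, and
$s(h,k) = \sum_{\mu=1}^{k-1} \{\{\frac{\mu}{k}\}\}\,\{\{\frac{h\mu}{k}\}\}$ is known
as a Dedekind sum, with $\{\{x\}\} = x - \lfloor x\rfloor - \frac{1}{2}$. Proofs are found in [10, 13, 18, 260]. ◁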
▷ VIII.24. Meinardus’ theorem. Consider the infinite product ($a_n \ge 0$)

$$f(z) = \prod_{n=1}^{\infty} (1 - z^n)^{-a_n}.$$

The associated Dirichlet series is $\alpha(s) = \sum_{n\ge 1} \frac{a_n}{n^s}$. Assume that $\alpha(s)$ is continuable into a
meromorphic function to $\Re(s) \ge -C_0$ for some $C_0 > 0$, with only a simple pole at some
$\rho > 0$ and corresponding residue $A$; assume also that $\alpha(s)$ is of moderate growth in the
half-plane, namely, $\alpha(s) = O(|s|^{C_1})$, for some $C_1 > 0$ (as $|s| \to \infty$ in $\Re(s) \ge -C_0$). Let
$g(z) = \sum_{n\ge 1} a_n z^n$ and assume a concentration condition of the form

$$\Re\, g(e^{-t-2i\pi y}) - g(e^{-t}) \le -C_2\,y^{-\epsilon}.$$

Then the coefficient $f_n = [z^n]f(z)$ satisfies

$$f_n \sim C\,n^{\kappa}\exp\left(K\,n^{\rho/(\rho+1)}\right), \qquad
K = (1 + \rho^{-1})\left[A\,\Gamma(\rho+1)\,\zeta(\rho+1)\right]^{1/(\rho+1)}.$$

The constants $C$, $\kappa$ are:

$$C = e^{\alpha'(0)}\,(2\pi(1+\rho))^{-1/2}\left[A\,\Gamma(\rho+1)\,\zeta(\rho+1)\right]^{(1-2\alpha(0))/(2\rho+2)}, \qquad
\kappa = \frac{\alpha(0) - 1 - \frac{1}{2}\rho}{1+\rho}.$$

Details of the concentration condition, and error terms are found in [10, Ch 6]. ◁


▷ VIII.25. Various types of partitions. The number of partitions into distinct odd summands,
squares, cubes, triangular numbers, are cases of application of Meinardus’ method. For instance
the method provides, for the number $Q_n$ of partitions into distinct summands, the asymptotic
form

$$Q_n \sim \frac{e^{\pi\sqrt{n/3}}}{4\cdot 3^{1/4}\,n^{3/4}}.$$

The central approximation is obtained by a Mellin analysis from

$$L(t) := \log Q(e^{-t}) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m}\,\frac{e^{-mt}}{1 - e^{-mt}}, \qquad
L^{\star}(s) = \Gamma(s)\,\zeta(s)\,\zeta(s+1)\,(1 - 2^{-s}),$$

$$L(t) \sim \frac{\pi^2}{12t} - \log\sqrt{2} + \frac{1}{24}\,t.$$

(See the already cited references [10, 13, 18, 260].) ◁


▷ VIII.26. Plane partitions. A plane partition of a given number $n$ is a two-dimensional array
of integers $n_{i,j}$ that are nonincreasing both from left to right and top to bottom and that add up
to $n$. The first few terms (EIS A000219) are 1, 1, 3, 6, 13, 24, 48, 86, 160, 282, 500, 859 and P.
A. MacMahon proved that the OGF is

$$R(z) = \prod_{m=1}^{\infty} \frac{1}{(1 - z^m)^m}.$$

Meinardus’ method applies to give

$$R_n \sim \left(\zeta(3)\,2^{-11}\right)^{1/36} n^{-25/36}\exp\left(3\cdot 2^{-2/3}\,\zeta(3)^{1/3}\,n^{2/3} + 2c\right),$$
where c = —7Sy(log(27) + y — 1). 
(See [10, p. 199] for this result due to Wright [504] in 1931.) <q 
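
The first terms quoted in this note can be reproduced from MacMahon's product $\prod_{m\ge1}(1-z^m)^{-m}$. The sketch below simply multiplies in each factor $1/(1-z^m)$ a total of $m$ times; the function name is illustrative.

```python
def plane_partition_numbers(n_max: int):
    """Coefficients of prod_{m>=1} (1 - z^m)^(-m) up to z^{n_max}."""
    coeffs = [1] + [0] * n_max
    for m in range(1, n_max + 1):
        for _ in range(m):  # apply the factor 1/(1 - z^m) exactly m times
            for total in range(m, n_max + 1):
                coeffs[total] += coeffs[total - m]
    return coeffs


print(plane_partition_numbers(11))
```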


▷ VIII.27. Partitions into primes. Let $P_n^{(\Pi)}$ be the number of partitions of $n$ into summands
that are all prime numbers,

$$P^{(\Pi)}(z) = \prod_{m=1}^{\infty} \left(1 - z^{p_m}\right)^{-1},$$

where $p_m$ is the $m$th prime ($p_1 = 2$, $p_2 = 3$, ...). The sequence starts as (EIS A000607):
1, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 10, 12, 14, 17, 19, 23, 26, 30, 35, 40.

Then

$$(72)\qquad \log P_n^{(\Pi)} = 2\pi\sqrt{\frac{n}{3\log n}}\;(1 + o(1)).$$

An upper bound of a form consistent with (72) can be derived elementarily as a saddle point
bound based on the property

$$\sum_{m\ge 1} e^{-t p_m} \sim \frac{1}{t\log t^{-1}}, \qquad t \to 0.$$

This last fact results either from the Prime Number Theorem or from a Mellin analysis based
on the fact that $\Pi(s) := \sum_{m\ge1} p_m^{-s}$ satisfies, with $\mu(m)$ the Möbius function,

$$\Pi(s) = \sum_{m=1}^{\infty} \frac{\mu(m)}{m}\,\log\zeta(ms).$$

(See Roth and Szekeres’ study [415] as well as Yang’s article [512] for relevant references and
recent technology.) This is in sharp contrast with compositions into primes (Chapter V, p. 317)
whose analysis turned out to be especially easy. ◁
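
The initial terms listed in this note follow from the product form $\prod_m (1 - z^{p_m})^{-1}$ by the usual dynamic programme, one prime at a time. A minimal sketch (the helper names are illustrative, and the primality test is deliberately naive):

```python
def primes_up_to(limit: int):
    """All primes <= limit, by trial division (adequate for small limits)."""
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def prime_partition_numbers(n_max: int):
    """Number of partitions of 0..n_max into prime summands."""
    counts = [1] + [0] * n_max
    for p in primes_up_to(n_max):
        for total in range(p, n_max + 1):
            counts[total] += counts[total - p]
    return counts


print(prime_partition_numbers(23))
```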


▷ VIII.28. Partitions into powers of 2. Let $M_n$ be the number of partitions of integer $n$ into
summands that are powers of 2. Thus $M(z) = \prod_{m\ge 0}(1 - z^{2^m})^{-1}$. The sequence $(M_n)$ starts
as 1, 1, 2, 2, 4, 4, 6, 6, 10 (EIS A018819). One has

$$\log M_{2n} = \frac{1}{2\log 2}\left(\log\frac{n}{\log n}\right)^2
+ \left(\frac{1}{2} + \frac{1}{\log 2} + \frac{\log\log 2}{\log 2}\right)\log n + O(\log\log n).$$

De Bruijn [110] determined the precise asymptotic form of $M_{2n}$. (See also [140] for related
problems.) ◁
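
A short sketch reproducing the first values of $M_n$ from the product $\prod_{m\ge0}(1 - z^{2^m})^{-1}$. It also checks the elementary relation $M_{2k} = M_{2k-1} + M_k$, which is a standard property of binary partitions and is not stated in the text.

```python
def binary_partition_numbers(n_max: int):
    """Number of partitions of 0..n_max into powers of 2."""
    counts = [1] + [0] * n_max
    power = 1
    while power <= n_max:
        for total in range(power, n_max + 1):
            counts[total] += counts[total - power]
        power *= 2
    return counts


m = binary_partition_numbers(100)
print(m[:9])
print(all(m[2 * k] == m[2 * k - 1] + m[k] for k in range(1, 51)))
```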


Averages and moments. Based on the foregoing analysis, it is possible to per- 
form the analysis of several parameters of integer partitions in a way parallel to our 
discussion of moments in Subsection VIII. 5.3. In particular, it becomes possible to 
justify the empirical observations regarding the profile of partitions made in the course
of Example III.6, p. 160.


▷ VIII.29. Mean number of parts in integer partitions. The mean number of parts (or
summands) in a random integer partition of size $n$ is

$$\frac{1}{K}\sqrt{n}\,\log n + O(n^{1/2}), \qquad K = \pi\sqrt{\frac{2}{3}}.$$

For a partition into distinct parts, the mean number of parts is

$$\frac{2\sqrt{3}\,\log 2}{\pi}\,\sqrt{n} + o(n^{1/2}).$$

The complex-analytic proof only requires the central estimates of $\log P(e^{-t})$ and $\log Q(e^{-t})$,
given the concentration properties, as well as the estimates

$$\sum_{m\ge 1}\frac{e^{-mt}}{1 - e^{-mt}} \sim \frac{-\log t + \gamma}{t} + \frac{1}{4}, \qquad
\sum_{m\ge 1}\frac{(-1)^{m-1}e^{-mt}}{1 - e^{-mt}} \sim \frac{\log 2}{t} - \frac{1}{4},$$

which result from a standard Mellin analysis, the respective transforms being

$$\Gamma(s)\,\zeta(s)^2, \qquad \Gamma(s)\,(1 - 2^{1-s})\,\zeta(s)^2.$$

Full asymptotic expansions of the mean and of moments of any order can be determined. In
addition, the distributions are concentrated around their mean. (The first order estimates are
due to Erdős and Lehner [155] who gave an elementary derivation and also obtained the limit
distribution of the number of summands in both cases: they are a double exponential (for $P$)
and a Gaussian (for $Q$).) ◁


VIII. 7. Large powers 


The extraction of coefficients in powers of a fixed function and more generally
in functions of the form $A(z)B(z)^n$ constitutes a prototypical and easy application of
the saddle point method. We will accordingly be concerned here with the problem of
estimating

$$(73)\qquad [z^N]\,A(z)\cdot B(z)^n = \frac{1}{2i\pi}\oint A(z)\,B(z)^n\,\frac{dz}{z^{N+1}},$$

as both $n$ and $N$ get large. This situation generalizes directly the example of inverse
factorials and the exponential, where we have dealt with a coefficient extraction equivalent
to $[z^n](e^z)^n$ (see pp. 515 and 520), as well as the case of the central binomial
coefficients, where an estimate of $[z^n](1+z)^{2n}$ is wanted (p. 515). General estimates
relative to (73) are derived in Subsections VIII.7.1 (bounds) and VIII.7.2 (asymptotics).
We finally discuss perturbations of the basic saddle point paradigm in the case
of large powers (Subsection VIII.7.3): Gaussian approximations are obtained in a
way that generalizes “local” versions of the Central Limit Theorem for sums of
discrete random variables. This last subsection paves the way for the analysis of limit
laws in the next chapter, where the rich framework of “quasi-powers” will be shown
to play a central rôle in so many combinatorial applications.


VIII.7.1. Large powers: saddle point bounds. We consider throughout this
section two fixed functions, $A(z)$ and $B(z)$ satisfying the following conditions:

L1: The functions $A(z) = \sum_{j\ge 0} a_j z^j$ and $B(z) = \sum_{j\ge 0} b_j z^j$ are analytic at 0
and have nonnegative coefficients; furthermore it is assumed (without loss
of generality) that $B(0) \ne 0$.

L2: The function $B(z)$ is aperiodic in the sense that $\gcd\{\, j \mid b_j > 0 \,\} = 1$.
(Thus $B(z)$ is not a function of the form $\beta(z^p)$ for some integer $p \ge 2$ and
some $\beta$ analytic at 0.)

L3: Let $R \le \infty$ be the radius of convergence of $B(z)$; the radius of convergence
of $A(z)$ is at least as large as $R$.


Define the quantity $T$ called the spread:

$$(74)\qquad T := \lim_{x\to R^-} \frac{x\,B'(x)}{B(x)}.$$

Our purpose is to analyse the coefficients

$$[z^N]\,A(z)\cdot B(z)^n,$$

when $N$ and $n$ are linearly related. The condition $N < Tn$ will be imposed: it is both
technically needed in our proof and inherent in the nature of the problem. (For $B$ a
polynomial of degree $d$, the spread is $T = d$; for a function $B$ whose derivative at
its dominant positive singularity remains bounded, the spread is finite; for $B(z) = e^z$
and more generally for entire functions, the spread is $T = \infty$.)

Saddle point bounds result almost immediately from the previous assumptions.

Proposition VIII.9 (Saddle point bounds for large powers). Consider functions $A(z)$
and $B(z)$ satisfying the conditions L1, L2, L3 above. Let $\lambda$ be a fixed positive number
with $0 < \lambda < T$ and let $\zeta$ be the unique positive root of the equation

$$\zeta\,\frac{B'(\zeta)}{B(\zeta)} = \lambda.$$

Then, with $N = \lambda n$ an integer, one has

$$[z^N]\,A(z)\cdot B(z)^n \le A(\zeta)\,B(\zeta)^n\,\zeta^{-N}.$$


PROOF. The existence and unicity of $\zeta$ is guaranteed by an argument already encountered
several times (Note 44, p. 266, and Note 4, p. 516). The conclusion then follows
by an application of general saddle point bounds (Corollary VIII.1, p. 514).


EXAMPLE VIII.10. Entropy bounds for binomial coefficients. Consider the problem of
estimating the binomial coefficients $\binom{n}{\lambda n}$ for some $\lambda$ with $0 < \lambda < 1$. It is assumed that $N = \lambda n$
is an integer. Proposition VIII.9 provides

$$\binom{n}{N} = [z^N](1+z)^n \le (1+\zeta)^n\,\zeta^{-N},$$

where $\frac{\zeta}{1+\zeta} = \lambda$, i.e., $\zeta = \frac{\lambda}{1-\lambda}$. A simple computation then shows that

$$\binom{n}{\lambda n} \le \exp(nH(\lambda)), \quad\text{where}\quad H(\lambda) = -\lambda\log\lambda - (1-\lambda)\log(1-\lambda)$$

is the entropy function. Thus, for $\lambda \ne \frac{1}{2}$, the binomial coefficients $\binom{n}{\lambda n}$ are exponentially
smaller than the central coefficient $\binom{n}{n/2}$ and the entropy function precisely quantifies this
exponential gap. .................. END OF EXAMPLE VIII.10.
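
The entropy bound of this example is easy to test numerically. The sketch below evaluates the saddle point bound $(1+\zeta)^n\zeta^{-N}$ with $\zeta = \lambda/(1-\lambda)$ and the equivalent form $\exp(nH(\lambda))$, and compares both with the exact binomial coefficient; the names are illustrative.

```python
from math import comb, exp, log


def entropy(lam: float) -> float:
    """H(lam) = -lam log lam - (1 - lam) log(1 - lam), for 0 < lam < 1."""
    return -lam * log(lam) - (1.0 - lam) * log(1.0 - lam)


def saddle_point_bound(n: int, N: int) -> float:
    """(1 + zeta)^n * zeta^(-N) at the saddle point zeta = lam / (1 - lam), lam = N / n."""
    lam = N / n
    zeta = lam / (1.0 - lam)
    return (1.0 + zeta) ** n * zeta ** (-N)


n, N = 100, 30
print(comb(n, N), saddle_point_bound(n, N), exp(n * entropy(N / n)))
```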


▷ VIII.30. Anomalous dice games. The probability of a score equal to $\lambda n$ in $n$ casts of an
unbiased die is bounded from above by a quantity of the form $e^{-nK}$ where

$$K = \log 6 - \log\left(\frac{1-\zeta^6}{1-\zeta}\right) + (\lambda - 1)\log\zeta,$$

and $\zeta$ is an algebraic function of $\lambda$ determined by $\sum_{j=1}^{6}(\lambda - j)\,\zeta^j = 0$. ◁


▷ VIII.31. Large deviation bounds for sums of random variables. Let $g(u) = \mathbb{E}(u^X)$ be the
probability generating function of a discrete random variable $X \ge 0$ and let $\mu = g'(1)$ be the
corresponding mean (assume $\mu < \infty$). Set $N = \lambda n$ and let $\zeta$ be the root of $\zeta g'(\zeta)/g(\zeta) = \lambda$
assumed to exist within the domain of analyticity of $g$. Then, for $\lambda < \mu$, one has

$$\sum_{k\le N} [u^k]\,g(u)^n \le \frac{1}{1-\zeta}\,g(\zeta)^n\,\zeta^{-N}.$$

Dually, for $\lambda > \mu$, one finds

$$\sum_{k\ge N} [u^k]\,g(u)^n \le \frac{\zeta}{\zeta-1}\,g(\zeta)^n\,\zeta^{-N}.$$

These are exponential bounds on the probability that $n$ copies of the variable $X$ have a sum
deviating substantially from the expected value. ◁


VIII.7.2. Large powers: saddle point analysis. The saddle point bounds for
large powers are technically shallow but useful, whenever only rough order of magnitude
estimates are sought. In fact, the full saddle point method is applicable under the
very conditions of the preceding proposition.

Theorem VIII.6 (Saddle point analysis for large powers). Under the conditions of
Proposition VIII.9, one has

$$(75)\qquad [z^N]\,A(z)\cdot B(z)^n = A(\zeta)\,\frac{B(\zeta)^n}{\zeta^{N+1}\sqrt{2\pi n\xi}}\,(1 + o(1)),$$

where $\zeta$ is the unique root of $\zeta B'(\zeta)/B(\zeta) = \lambda$ and

$$\xi = \frac{d^2}{d\zeta^2}\left(\log B(\zeta) - \lambda\log\zeta\right).$$

In addition, a full expansion in descending powers of $n$ exists.

These estimates hold uniformly for $\lambda$ in any compact interval of $(0, T)$, i.e., any
interval $[\lambda', \lambda'']$ with $0 < \lambda' < \lambda'' < T$, where $T$ is the spread.


PROOF. We discuss the analysis corresponding to a fixed $\lambda$. For any fixed $r$ such that
$0 < r < R$, the function $|B(re^{i\theta})|$ is, by positivity of coefficients and aperiodicity,
uniquely maximal at $\theta = 0$ (see The Daffodil Lemma on p. 253). It is also infinitely
differentiable at 0. Consequently there exists a (small) angle $\theta_1 \in (0, \pi)$ such that

$$|B(re^{i\theta})| < |B(re^{i\theta_1})| \quad\text{for all } \theta \in (\theta_1, \pi],$$

and at the same time, $|B(re^{i\theta})|$ is strictly decreasing for $\theta \in [0, \theta_1]$ (it is given by a
Taylor expansion without linear term).

We carry out the integration along the saddle point circle, $z = \zeta e^{i\theta}$, where the
previous inequalities on $|B(z)|$ hold. The contribution for $|\theta| > \theta_1$ is exponentially
negligible. Thus, up to exponentially small terms, the sought coefficient is given
asymptotically by $J(\theta_1)$, where

$$J(\theta_1) = \frac{\zeta^{-N}}{2\pi}\int_{-\theta_1}^{\theta_1} A(\zeta e^{i\theta})\,B(\zeta e^{i\theta})^n\,e^{-iN\theta}\,d\theta.$$

It is then possible to impose a second restriction on $\theta$, by introducing $\theta_0$ according to
the general heuristic, namely, $n\theta_0^2 \to \infty$, $n\theta_0^3 \to 0$. We fix here

$$\theta_0 \equiv \theta_0(n) = n^{-2/5}.$$

By the decrease of $|B(\zeta e^{i\theta})|$ on $[\theta_0, \theta_1]$ and by local expansions, the quantity
$J(\theta_1) - J(\theta_0)$ is of the form $\exp(-cn^{1/5})$ for some $c > 0$, that is, exponentially small.

Finally, local expansions are valid in the central range since $\theta_0$ tends to 0 as $n \to \infty$.
One finds for $z = \zeta e^{i\theta}$ and $|\theta| \le \theta_0$,

$$A(z)\,B(z)^n\,z^{-N} \sim A(\zeta)\,B(\zeta)^n\,\zeta^{-N}\exp(-n\xi\zeta^2\theta^2/2).$$

Then the usual process applies upon completing the tails, resulting in the stated estimate.
A complete expansion in powers of $n^{-1/2}$ is obtained by extending the expansion
of $\log B(z)$ to an arbitrary order (like in the case of Stirling’s formula, p. 522).
Furthermore, by parity, all the involved integrals of odd order vanish so that the
expansion turns out to be in powers of $1/n$ (rather than $1/\sqrt{n}$).

EXAMPLE VIII.11. Central binomials and trinomials, Motzkin numbers. An automatic
application of Theorem VIII.6 is to the central binomial coefficient $\binom{2n}{n} = [z^n](1+z)^{2n}$. In the
same way, one gets an estimate of the central trinomial number,

$$T_n := [z^n](1 + z + z^2)^n \quad\text{shown to satisfy}\quad T_n \sim \frac{3^{n+1/2}}{2\sqrt{\pi n}}.$$

The Motzkin numbers count unary-binary trees,

$$M_n = [z^n]M(z) \quad\text{where}\quad M = z\,(1 + M + M^2).$$

The standard approach is the one seen earlier based on singularity analysis as the implicitly
defined function $M(z)$ has an algebraic singularity of the $\sqrt{\ }$-type, but the Lagrange inversion
formula provides an equally workable route. It gives

$$M_{n+1} = \frac{1}{n+1}\,[w^n]\,(1 + w + w^2)^{n+1},$$

which is amenable to saddle point analysis via Theorem VIII.6, leading to

$$M_n \sim \frac{3^{n+1/2}}{2\sqrt{\pi n^3}}.$$

See below p. 552 for more on this theme. .................. END OF EXAMPLE VIII.11.
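
To make the use of formula (75) concrete, here is a hedged sketch for the special case where $B$ is a polynomial with nonnegative coefficients and $A(z) = 1$. It solves $\zeta B'(\zeta)/B(\zeta) = \lambda$ by bisection, evaluates $\xi = \frac{d^2}{d\zeta^2}(\log B(\zeta) - \lambda\log\zeta)$ in closed form, and compares the estimate with exact coefficients. All function names, and the choice of bisection, are illustrative assumptions.

```python
from math import pi, sqrt


def poly_eval(coeffs, x):
    """Evaluate sum_j coeffs[j] * x^j by Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def poly_derivative(coeffs):
    return [j * c for j, c in enumerate(coeffs)][1:]


def saddle_point(b, lam):
    """Positive root of x B'(x)/B(x) = lam; requires 0 < lam < deg B (the spread)."""
    d1 = poly_derivative(b)
    ratio = lambda x: x * poly_eval(d1, x) / poly_eval(b, x)
    lo, hi = 0.0, 1.0
    for _ in range(200):
        if ratio(hi) >= lam:
            break
        hi *= 2.0
    else:
        raise ValueError("lam must be smaller than the spread of B")
    for _ in range(200):  # bisection; ratio is increasing on (0, infinity)
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if ratio(mid) < lam else (lo, mid)
    return (lo + hi) / 2.0


def large_power_estimate(b, n, N):
    """Saddle point estimate of [z^N] B(z)^n (multiplier A = 1)."""
    lam = N / n
    zeta = saddle_point(b, lam)
    d1 = poly_derivative(b)
    B0, B1, B2 = poly_eval(b, zeta), poly_eval(d1, zeta), poly_eval(poly_derivative(d1), zeta)
    xi = (B2 * B0 - B1 * B1) / (B0 * B0) + lam / zeta ** 2
    return B0 ** n / (zeta ** (N + 1) * sqrt(2.0 * pi * n * xi))


def exact_coefficient(b, n, N):
    """Exact [z^N] B(z)^n by repeated multiplication, truncated at degree N."""
    power = [1]
    for _ in range(n):
        nxt = [0] * (len(power) + len(b) - 1)
        for i, p in enumerate(power):
            for j, c in enumerate(b):
                nxt[i + j] += p * c
        power = nxt[: N + 1]
    return power[N] if N < len(power) else 0


for n in (10, 50, 200):
    print(n, exact_coefficient([1, 1, 1], n, n) / large_power_estimate([1, 1, 1], n, n))
```

For $B(z) = 1 + z + z^2$ and $N = n$ the saddle point is $\zeta = 1$ and $\xi = 2/3$, so the estimate reduces to $3^{n+1/2}/(2\sqrt{\pi n})$, the central trinomial asymptotics quoted above.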


We have opted for a basic formulation of the theorem with conditions on $A$ and $B$
that are not minimal. It is easily recognized that the estimates of Theorem VIII.6
continue to hold, provided that the function $|B(re^{i\theta})|$ attains a unique maximum on
the positive real axis, when $r \in (0, R)$ is fixed and $\theta$ varies on $[-\pi, \pi]$. Also, in order
for the statement to hold true, it is only required that the function $A(z)$ does not vanish
on $(0, R)$, and $A(z)$ or $B(z)$ could then well be allowed to have negative coefficients:
see Note 35. Finally, if $A(\zeta) = 0$, then a simple modification of the argument still
provides precise estimates in this vanishing case; see Note 34 below.
> VIII.32. Central Stirling numbers. The central Stirling numbers of both kinds satisfy 


2n 
n 


n! 


(2n)! (2n)! |} n 


! 
| ~ Ain? (1+ O(n~*)), att ~ cg Agn 1? (1+ O(n™*)), 


where A; = 2.45540, Az = 1.54413, and Ai, Ag are expressible in terms of special values of 
the Cayley tree function. Similar estimates hold for [Bal and ol : <q 
> VIII.33. Integral points on high-dimensional spheres. This note is based on an article by 
Mazo and Odlyzko [353]. Let L(n, c) be the number of lattice points (i.e., points with integer 
coordinates) in n-dimensional space that lie on the sphere of radius N = ,/an assumed to be 
an integer. Then, 


co 


L(n,a) = [z%JO(z)", where @(z) = Doz” =142502”. 


neZ n=1 


Thus, there are computable constants C’, D depending on a such that L(n, a) ~ Cn-¥2D", 
The number of lattice points inside the sphere can be similarly estimated. (Such bounds are 
useful in coding theory, combinatorial optimization, especially the knapsack problem, and cryp- 
tography.) dq 


▷ VIII.34. Coalescence of a saddle point with roots of the multiplier. Fix $\zeta$ and take a 
multiplier A(z) in Theorem VIII.6 such that $A(\zeta) = 0$, but $A'(\zeta) \neq 0$. The formula (75) is 
then to be modified as follows: 

e" Ale) Be)” = (4) +64") 

CN+1, /onn3é3 


Higher order cancellations can also be taken into account. <q 


(1+ 0(1)). 


▷ VIII.35. A function with negative coefficients that is minimal along the positive axis. Take 
$B(z) = 1 + z - z^{10}$ with $|z| < \frac{1}{10}$. By design, B(z) has negative Taylor coefficients. On the 
other hand, $|B(re^{i\theta})|$ for fixed $r < \frac{1}{10}$ (say) attains its unique maximum at $\theta = 0$. The saddle 
point method still applies and an estimate of $[z^{N}]B(z)^{n}$ is obtained by (75). ◁ 


Large powers: saddle points versus singularity analysis. In general, the La- 
grange inversion formula establishes an exact correspondence between two a priori 
different problems, namely, 


— the estimation of coefficients of large order in large powers, and 
— the estimation of coefficients of implicitly defined functions. 


In one direction, the Lagrange Inversion Theorem has the capacity of bringing 
the evaluation of coefficients of implicit functions into the orbit of the saddle point 
method. Indeed, let Y be defined implicitly by $Y = z\phi(Y)$, where $\phi$ is analytic at 0 
and aperiodic. One has, by Lagrange, 

$$[z^{n+1}]\,Y(z) = \frac{1}{n+1}\,[w^{n}]\,\phi(w)^{n+1},$$

which is of the type (75). Then, under the assumption that the characteristic equation 
$\phi(\tau) - \tau\phi'(\tau) = 0$ has a positive root within the disc of convergence of $\phi$, a direct 
application of Theorem VIII.6 yields 

$$[z^{n}]\,Y(z) \sim \gamma\,\frac{\rho^{-n}}{2\sqrt{\pi n^{3}}}, \qquad \rho := \frac{\tau}{\phi(\tau)}, \quad \gamma := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}}.$$

This last estimate is equivalent to the statement of Theorem VII.2 on page 435, obtained 
there by singularity analysis. (As we know from Chapter VII, this provides 
the number of trees in a simple variety, with $\phi$ being the degree generating function 
of the variety.) This approach is in a few cases more convenient to work with than 
singularity analysis, especially when explicit or uniform upper bounds are required, 
since constructive bounds tend to be more easily obtained on circles than on variable 
Hankel contours (Note 36). 
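The correspondence between the Lagrange form and the saddle point estimate can be tried out on a polynomial $\phi$. The following is a minimal sketch under stated assumptions: $\phi$ is a polynomial with nonnegative coefficients given as a coefficient list, the function names are hypothetical, and the search bound `hi` for the characteristic root is an arbitrary choice. With $\phi(w) = (1+w)^2$ the coefficients $[z^n]Y(z)$ are the Catalan numbers.

```python
from math import pi, sqrt

def poly_eval(coeffs, x):
    """Evaluate a polynomial from its coefficient list (constant term first)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def poly_deriv(coeffs):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def characteristic_root(phi, hi=10.0, iters=200):
    """Positive root tau of phi(t) - t*phi'(t) = 0, by bisection on (0, hi).

    The left side decreases from phi(0) > 0 when phi has nonnegative
    coefficients; hi is an assumed upper bound for the search.
    """
    dphi = poly_deriv(phi)
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if poly_eval(phi, mid) - mid * poly_eval(dphi, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lagrange_coeff(phi, n):
    """Exact [z^n] Y(z) for Y = z*phi(Y): (1/n) [w^(n-1)] phi(w)^n."""
    power = [1] + [0] * (n - 1)
    for _ in range(n):
        nxt = [0] * n
        for i, p in enumerate(power):
            if p:
                for j, c in enumerate(phi):
                    if i + j < n:
                        nxt[i + j] += p * c
        power = nxt
    return power[n - 1] / n

def saddle_estimate(phi, n):
    """gamma * rho^(-n) / (2 sqrt(pi n^3)), with rho = tau / phi(tau)."""
    tau = characteristic_root(phi)
    second = poly_eval(poly_deriv(poly_deriv(phi)), tau)
    gamma = sqrt(2 * poly_eval(phi, tau) / second)
    rho = tau / poly_eval(phi, tau)
    return gamma * rho ** (-n) / (2 * sqrt(pi * n ** 3))

phi = [1, 2, 1]  # phi(w) = (1 + w)^2, so that [z^n] Y(z) is a Catalan number
for n in (10, 50, 200):
    print(n, lagrange_coeff(phi, n) / saddle_estimate(phi, n))
```

The printed ratios tend to 1 as n grows; replacing `phi` by `[1, 1, 1]` gives the unary-binary trees counted by the Motzkin numbers.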

Conversely, the Lagrange Inversion Theorem makes it possible to approach large 
powers problems by means of singularity analysis of an implicitly defined function.⁴ 
This mode of operation can prove quite useful when there occurs a coalescence between 
saddle points and singularities of the integrand (Note 37). 


⁴ This is in essence an approach suggested by several sections of the original memoir of Darboux [106, 
§3-§5], in which “Darboux’s method” discussed in Chapter VI was first proposed. It is also of interest to 
note that a Lagrangean change of variables transforms a saddle point circle into a contour whose geometry 
is of the type used in singularity analysis. 

▷ VIII.36. An assertion of Ramanujan. In his first letter to Hardy, Ramanujan (1913) announced 
that 

$$\frac{1}{2}e^{n} = 1 + \frac{n}{1!} + \frac{n^{2}}{2!} + \cdots + \frac{n^{n-1}}{(n-1)!} + \frac{n^{n}}{n!}\,\theta, \qquad\text{where}\quad \theta = \frac{1}{3} + \frac{4}{135(n + k)},$$

and k lies between $\frac{8}{45}$ and $\frac{2}{21}$. Ramanujan’s assertion indeed holds for all $n \geq 1$; see [187] for 
a proof based on saddle points and effective bounds. ◁ 
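Ramanujan's assertion lends itself to a direct numerical experiment. The sketch below is only an illustration (the function name `ramanujan_k` and the working precision of 120 digits are assumptions of this example): it solves the defining identity for $\theta$, then for k, using exact rationals for the finite sum and high-precision decimals for $e^n$.

```python
from decimal import Decimal, getcontext
from fractions import Fraction
from math import factorial

getcontext().prec = 120  # assumed precision, ample for the values of n used here

def to_decimal(q):
    """Convert an exact Fraction to a Decimal at the current precision."""
    return Decimal(q.numerator) / Decimal(q.denominator)

def ramanujan_k(n):
    """Solve e^n/2 = sum_{j<n} n^j/j! + theta*n^n/n!, theta = 1/3 + 4/(135(n+k)), for k."""
    partial = sum(Fraction(n ** j, factorial(j)) for j in range(n))
    last = Fraction(n ** n, factorial(n))
    theta = (Decimal(n).exp() / 2 - to_decimal(partial)) / to_decimal(last)
    return Decimal(4) / (135 * (theta - Decimal(1) / 3)) - n

for n in (1, 2, 5, 10, 40):
    print(n, float(ramanujan_k(n)))
```

For these values the computed k stays between 2/21 ≈ 0.0952 and 8/45 ≈ 0.1778, in agreement with the assertion.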


▷ VIII.37. Coalescence between a saddle-point and a singularity. The integral in 

$$I_n := [y^{n}]\,(1 + y)^{2n}(1 - y)^{-\alpha} = \frac{1}{2i\pi}\oint \frac{(1 + y)^{2n}}{(1 - y)^{\alpha}}\,\frac{dy}{y^{n+1}}$$

can be treated directly, but this requires a suitable adaptation of the saddle-point method, given 
the coalescence between a saddle point at 1 [the part without the $(1 - y)^{-\alpha}$ factor] and a 
singularity at that same point. Alternatively, it can be subjected to the change of variables 
$z = y/(1 + y)^{2}$. Then y is defined implicitly by $y = z(1 + y)^{2}$, so that 

$$I_n = \frac{1}{2i\pi}\oint \frac{1 + y}{(1 - y)^{\alpha+1}}\,\frac{dz}{z^{n+1}} = [z^{n}]\,\frac{1 + y}{(1 - y)^{\alpha+1}}.$$

Since y(z) has a square-root singularity at z = 1/4, the integrand is of type $Z^{-(1+\alpha)/2}$ (with $Z := 1 - 4z$), and 

$$I_n \sim \frac{2^{2n-\alpha}}{\Gamma\!\left(\frac{\alpha+1}{2}\right)}\,n^{(\alpha-1)/2}.$$

In general, for $\phi(y)$ satisfying the assumptions (relative to B) of Theorem VIII.6, one 
finds, with $\tau$ : $\phi(\tau) - \tau\phi'(\tau) = 0$, 

 1 if oy)” dy (22)’ oe 
 2im Jor (G(T) — O(y))® y” T ra): 

Van der Waerden discusses this problem systematically in [479]. See also Section VIII. 9 below 
for other coalescence situations. ◁ 


VIII. 7.3. Large powers: Gaussian forms. Saddle point analysis has conse- 
quences for multivariate asymptotics and it constitutes a direct way of establishing 
that many discrete distributions tend to the Gaussian law in the asymptotic limit. For 
large powers, this property derives painlessly from our earlier developments, espe- 
cially Theorem VIII.6, by means of a “perturbation” analysis. 


First, let us examine a particularly easy problem: How do the coefficients of 
$[z^{N}]e^{nz}$ vary as a function of N when n is some large but fixed number? These 
coefficients are 

$$C_N^{(n)} = [z^{N}]e^{nz} = \frac{n^{N}}{N!}.$$


By the ratio test, they have a maximum when N = n and are small when N differs 
significantly from n; see Figure 9. The bell-shaped profile is also apparent on the figure 
and is easily verified by elementary real analysis. The situation is then parallel to 
what is already known of the binomial coefficients on the nth line of Pascal’s triangle, 
corresponding to $[z^{N}](1 + z)^{n}$ with N varying. 
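The bell-shaped profile can be observed numerically. The sketch below is an illustration only: the function names are hypothetical, and the comparison curve is the Gaussian density with mean n and variance n, which is the natural candidate for these Poisson-type coefficients. The coefficients are normalized by $e^{-n}$ and evaluated through logarithms to avoid overflow.

```python
from math import exp, lgamma, log, pi, sqrt

def normalized_coeff(n, N):
    """e^{-n} [z^N] e^{nz} = e^{-n} n^N / N!, computed via logarithms."""
    return exp(-n + N * log(n) - lgamma(N + 1))

def gaussian_profile(n, N):
    """Gaussian density with mean n and variance n, evaluated at N."""
    return exp(-(N - n) ** 2 / (2 * n)) / sqrt(2 * pi * n)

n = 100  # same value as in Figure 9
for N in (80, 90, 100, 110, 120):
    print(N, normalized_coeff(n, N), gaussian_profile(n, N))
```

The two columns agree closely near N = n and drift apart slowly as N moves several standard deviations away from the centre.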

The asymptotically Gaussian character of coefficients of large powers is actually 
universal amongst a wide class of analytic functions. We prove this within the framework 
of large powers already investigated in Subsection VIII. 7.1 and consider the 
general problem of estimating the coefficients $[z^{N}]\left(A(z) \cdot B(z)^{n}\right)$ as N varies. In accordance 
with the conditions on p. 548, we postulate the following: (L1): A(z), B(z) 
are analytic at 0, have nonnegative coefficients, and are such that $B(0) \neq 0$; (L2): 
B(z) is aperiodic; (L3): The radius of convergence R of B(z) is a minorant of the 
radius of convergence of A(z). We also recall that the spread has been defined as 
$T := \lim_{x \to R^{-}} xB'(x)/B(x)$. 

FIGURE VIII.9. The coefficients $[z^{N}]e^{nz}$, when n = 100 is fixed and N = 0..200 
varies, have a bell-shaped aspect. (The coefficients are normalized by $e^{-n}$.) 

Theorem VIII.7 (Large powers and Gaussian forms). Consider the “large powers” 
coefficients: 

$$C_N^{(n)} := [z^{N}]\left(A(z) \cdot B(z)^{n}\right). \tag{76}$$

Assume that the two analytic functions A(z), B(z) satisfy the conditions (L1), (L2), 
and (L3) above. Assume also that the radius of convergence of B satisfies R > 1. 
Define the two constants: 

$$\mu = \frac{B'(1)}{B(1)}, \qquad \sigma^{2} = \frac{B''(1)}{B(1)} + \frac{B'(1)}{B(1)} - \left(\frac{B'(1)}{B(1)}\right)^{2} \quad (\sigma > 0).$$

Then the coefficients $C_N^{(n)}$ for fixed n as N varies have an asymptotically Gaussian 
profile in the precise sense that for $N = \mu n + x\sqrt{n}$, there holds (as $n \to \infty$) 

$$\frac{1}{A(1)B(1)^{n}}\,C_N^{(n)} = \frac{1}{\sigma\sqrt{2\pi n}}\,e^{-x^{2}/(2\sigma^{2})}\left(1 + O(n^{-1/2})\right), \tag{77}$$

uniformly with respect to x, when x belongs to a finite interval of the real line. 
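A small experiment illustrates the Gaussian profile (77). This is a sketch under assumptions chosen purely for illustration: the polynomials $A(z) = 1 + z$ and $B(z) = 1 + z + z^2$, the value n = 400, and all function names are not taken from the text. The coefficients are computed exactly with integers, and the constants $\mu$, $\sigma^2$ follow the definitions of Theorem VIII.7.

```python
from math import exp, pi, sqrt

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def large_power_coeffs(A, B, n):
    """Exact integer coefficients of A(z) * B(z)**n."""
    result = list(A)
    for _ in range(n):
        result = poly_mul(result, B)
    return result

def gaussian_constants(B):
    """mu and sigma^2 of Theorem VIII.7 from the derivatives of B at 1."""
    b0 = sum(B)
    b1 = sum(k * c for k, c in enumerate(B))
    b2 = sum(k * (k - 1) * c for k, c in enumerate(B))
    mu = b1 / b0
    return mu, b2 / b0 + mu - mu ** 2

A, B, n = [1, 1], [1, 1, 1], 400  # illustrative choices
coeffs = large_power_coeffs(A, B, n)
mu, sigma2 = gaussian_constants(B)
total = sum(A) * sum(B) ** n      # A(1) * B(1)^n, the normalization in (77)
for x in (-1.0, 0.0, 1.0):
    N = round(mu * n + x * sqrt(n))
    x_eff = (N - mu * n) / sqrt(n)  # value of x actually realized by the integer N
    exact = coeffs[N] / total
    approx = exp(-x_eff ** 2 / (2 * sigma2)) / sqrt(2 * pi * n * sigma2)
    print(N, exact, approx)
```

The residual discrepancy of a few percent away from the centre is consistent with the $O(n^{-1/2})$ error term, which here mostly reflects the shift of the mean induced by the multiplier A.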


PROOF. We start with a few easy observations that shed light on the global behaviour 
of the coefficients. First, since R > 1, we have the exact summation, 

$$\sum_{N \geq 0} C_N^{(n)} = A(1)\,B(1)^{n},$$

which explains the normalization factor in the estimate (77). Next, by definition of the 
spread and since R > 1, one has 

$$\mu = \frac{B'(1)}{B(1)} < \lim_{x \to R^{-}} \frac{xB'(x)}{B(x)} = T,$$

given the general property that $xB'(x)/B(x)$ is increasing. Thus, the estimation of 
the coefficients in the range $N = \mu n + O(\sqrt{n})$ falls into the orbit of Theorem VIII.6 
which expresses the results of the saddle point analysis in the case of large powers. 
Referring to the statement of Theorem VIII.6, the saddle point equation is 

$$\zeta\,\frac{B'(\zeta)}{B(\zeta)} = \frac{B'(1)}{B(1)} + \frac{x}{\sqrt{n}},$$

with $\zeta$ a function of x and n. For x in a bounded set, we thus have $\zeta \sim 1$ as $n \to \infty$. It 
then suffices to effect an asymptotic expansion of the quantities $\zeta$, $A(\zeta)$, $B(\zeta)$, $\xi$ in the 
saddle point formula of Equation (75). In other words, the fact that N is close to $\mu n$ 
induces for $\zeta$ a small perturbation with respect to the value 1. With $a_j := A^{(j)}(1)$ and 
$b_j := B^{(j)}(1)$, one finds mechanically 

$$\zeta = 1 + \frac{b_0^{2}}{b_0 b_2 + b_0 b_1 - b_1^{2}}\,\frac{x}{\sqrt{n}} + O(n^{-1}),$$
$$B(\zeta) = b_0 + \frac{b_0^{2}\,b_1}{b_0 b_2 + b_0 b_1 - b_1^{2}}\,\frac{x}{\sqrt{n}} + O(n^{-1}),$$

and so on. The statement follows. 
Take first $A(z) \equiv 1$. In the particular case when B(z) is the probability generating 
function of a discrete random variable Y, one has B(1) = 1, and the coefficient 
$\mu = B'(1)$ is the mean of the distribution. The function $B(z)^{n}$ is then the probability 
generating function (PGF) of a sum of n independent copies of Y. Theorem VIII.7 
then describes a Gaussian approximation of the distribution of the sum near the mean. 
Such an approximation is called a local limit law, where the epithet “local” refers 
to the fact that the estimate applies to the coefficients themselves. (In contrast, an 
approximation of the partial sums of the coefficients by the Gaussian error function 
is known as a central limit law or as an integral limit law.) In the more general 
case where A(z) is also a probability generating function of a nondegenerate random 
variable (i.e., $A(z) \neq 1$), similar properties hold and one has: 
Corollary VIII.3 (Local limit law for sums). Let X be a random variable with probability 
generating function (PGF) A(z) and $Y_1, \ldots, Y_n$ be independent variables with 
PGF B(z), where it is assumed that X and the $Y_j$ are supported on $\mathbb{Z}_{\geq 0}$. Assume that 
A(z) and B(z) are analytic in some disc that contains the unit disc in its interior and 
that B(z) is aperiodic. Then the sum, 

$$S_n := X + Y_1 + Y_2 + \cdots + Y_n,$$

satisfies a local limit law of the Gaussian type: For t in any finite interval, one has 

$$\mathbb{P}\left(S_n = \lfloor \mu n + t\sigma\sqrt{n} \rfloor\right) = \frac{e^{-t^{2}/2}}{\sigma\sqrt{2\pi n}}\left(1 + O(n^{-1/2})\right).$$

PROOF. This is just a restatement of Theorem VIII.7, setting $x = t\sigma$ and taking into 
account A(1) = B(1) = 1. 
Gaussian forms for large powers admit many variants. As already pointed out 
in Section VIII. 4, the positivity conditions can be greatly relaxed. Also, estimates 
for partial sums of the coefficients are possible by similar techniques. The asymp- 
totic expansions can be extended to any order. Finally, suitable adaptations of The- 
orems VIII.6 and VIII.7 make it possible to allow x to tend slowly to infinity and 
manage what is known as a “moderate deviation” regime. We do not pursue these as- 
pects here since we shall develop a more general framework, that of “Quasi-Powers” 
in the next chapter. 
▷ VIII.38. An alternative proof of Corollary VIII.3. The saddle point $\zeta$ is near 1 when N is near 
the centre $N \approx \mu n$. It is alternatively possible to recover the $C_N^{(n)}$ by Cauchy’s formula upon 
integrating along the circle |z| = 1, which is then only an approximate saddle point contour. 
This convenient variant is often used in the literature, but one needs to take care of linear terms 
in expansions. Its origins go back to Laplace himself in his first proof of the local limit theorem 
(which was expressed however in the language of Fourier series as Cauchy’s theory was yet 
to be born). See Laplace’s treatise Théorie Analytique des Probabilités [328] first published in 
1812 for much fascinating mathematics. ◁ 


VIII. 8. Saddle points and probability distributions 


Saddle point methods are useful not only for estimating combinatorial counts, but 
also for extracting probabilistic characteristics of large combinatorial structures. In the 
previous section, we have already encountered the large powers framework, giving rise 
to Gaussian laws. In this section, we further examine the way a saddle point analysis 
can serve to quantify properties of random structures. There is an extreme diversity of 
possible situations, which partly defy classification, so that we must content ourselves 
with the discussion of a single representative example. (A good rule of thumb is once 
more that the saddle point method is likely to succeed in cases involving some sort 
of exponential growth of GFs.) Note that problems of a true multivariate nature will 
be examined in the next chapter specifically dedicated to multivariate asymptotics and 
limit distributions. 


EXAMPLE VIII.12. Capacity in occupancy problems. This example is relative to random 
allocations, occupancy statistics, and balls-in-bins models, as already introduced in Chapter II. 
We limit ourselves here to saddle point bounds. (The various regimes of the distribution are 
well covered in [316, pp. 94–115].) 

Assume that n balls are thrown into n bins, uniformly at random. How many balls does the 
most filled bin contain? We shall in fact deal with a generalized version of the problem where 
n balls are thrown into m bins, in the regime $n = \alpha m$ for some fixed $\alpha$ in $(0, +\infty)$. The size 
of the most filled bin will be called the capacity and we let $C_{n,m}$ denote the random variable, 
when all $m^{n}$ tables are taken equally likely. Under our conditions a random bin contains on 
average a constant number, $\alpha$, of balls. The proposition below proves that the most filled bin 
has somewhat more, as exemplified by Figure 10. 


Proposition VIII.10. Let n and m tend simultaneously to infinity, with the constraint that 
$n/m = \alpha$ remains constant. Then, the expected capacity satisfies 

$$\frac{1}{2}\,\frac{\log n}{\log\log n}\,(1 + o(1)) \;\leq\; \mathbb{E}\{C_{n,m}\} \;\leq\; 2\,\frac{\log n}{\log\log n}\,(1 + o(1)).$$

In addition, the probability that the capacity lies outside the interval determined by the lower and 
upper bounds tends to 0 as $m, n \to \infty$. 

FIGURE VIII.10. Three random allocations of n = 50 balls in m = 50 bins. 

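PROOF. We detail the proof when $\alpha = 1$ and abbreviate $C_n = C_{n,m}$, the generalization to 
$\alpha \neq 1$ requiring only simple adjustments. From Chapter II, we know that 

$$\mathbb{P}\{C_n \leq b\} = \frac{n!}{n^{n}}\,[z^{n}]\left(e_b(z)\right)^{n}, \qquad \mathbb{P}\{C_n > b\} = \frac{n!}{n^{n}}\,[z^{n}]\left(e^{nz} - \left(e_b(z)\right)^{n}\right), \tag{78}$$

where $e_b(z)$ is the truncated exponential: 

$$e_b(z) := \sum_{j=0}^{b} \frac{z^{j}}{j!}.$$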


The two equalities of (78) permit us to bound the left and right tails of the distribution. As 
suggested by the Poisson approximation of the balls-in-bins model, we decide to adopt saddle point 
bounds based on z = 1. This gives 

$$\mathbb{P}\{C_n \leq b\} \leq \frac{n!\,e^{n}}{n^{n}}\left(\frac{e_b(1)}{e}\right)^{n}, \qquad \mathbb{P}\{C_n > b\} \leq \frac{n!\,e^{n}}{n^{n}}\left(1 - \left(\frac{e_b(1)}{e}\right)^{n}\right). \tag{79}$$

We set 

$$\rho_b(n) := \left(\frac{e_b(1)}{e}\right)^{n}. \tag{80}$$

This quantity represents the probability that n Poisson variables of rate 1 all have value b or less. 
(We know from elementary probability theory that this should be a reasonable approximation of 
the problem at hand.) A weak form of Stirling’s formula, namely, $n!\,e^{n}/n^{n} < 2\sqrt{\pi n}$ $(n \geq 1)$, 
then yields an alternative version of (79), 

$$\mathbb{P}\{C_n \leq b\} \leq 2\sqrt{\pi n}\,\rho_b(n), \qquad \mathbb{P}\{C_n > b\} \leq 2\sqrt{\pi n}\,\left(1 - \rho_b(n)\right). \tag{81}$$
For fixed n, the function $\rho_b(n)$ increases steadily from $e^{-n}$ to 1 as b varies from 0 to $\infty$. 
In particular, the “transition region” where $\rho_b(n)$ stays away from both 0 and 1 is expected to 
play a rôle. This suggests defining $b_0 \equiv b_0(n)$ such that 

$$b_0! \leq n < (b_0 + 1)!,$$

so that 

$$b_0(n) = \frac{\log n}{\log\log n}\,(1 + o(1)).$$

We also observe that, as $n, b \to \infty$, there holds 

$$\rho_b(n) = \left(\frac{e_b(1)}{e}\right)^{n} = \left(1 - \frac{e^{-1}}{(b+1)!} + O\!\left(\frac{1}{(b+2)!}\right)\right)^{n} = \exp\left(-\frac{n e^{-1}}{(b+1)!} + O\!\left(\frac{n}{(b+2)!}\right)\right). \tag{82}$$

Left tail. We take $b = \lfloor \frac{1}{2} b_0 \rfloor$ and a simple computation from (82) shows that for n large 
enough, $\rho_b(n) \leq \exp(-\sqrt[3]{n})$. Thus, by the first inequality of (81), the probability that the 
capacity be less than $\frac{1}{2} b_0$ is exponentially small: 

$$\mathbb{P}\{C_n \leq \tfrac{1}{2} b_0(n)\} \leq 2\sqrt{\pi n}\,\exp(-\sqrt[3]{n}). \tag{83}$$


Right tail. Take $b = 2b_0$. Then, again from (82), for n large enough, one has $1 - \rho_b(n) \leq 
1 - \exp(-\frac{1}{n}) = \frac{1}{n}(1 + o(1))$. Thus, the probability of observing a capacity that exceeds $2b_0$ 
is vanishingly small, and is $O(n^{-1/2})$. Taking next $b = 2b_0 + r$ with r > 0, similarly gives the 
bound 

$$\mathbb{P}\{C_n > 2b_0(n) + r\} \leq 2\sqrt{\frac{\pi}{n}}\left(\frac{1}{b_0(n)}\right)^{r}. \tag{84}$$


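The analysis of the left and right tails in Equations (83) and (84) now implies 

$$\mathbb{E}\{C_n\} \leq 2b_0(n) + 2\sqrt{\frac{\pi}{n}}\sum_{r=0}^{\infty}\left(b_0(n)\right)^{-r} = 2b_0(n)\,(1 + o(1)),$$
$$\mathbb{E}\{C_n\} \geq \sum_{r=0}^{\lfloor b_0(n)/2 \rfloor}\left[1 - 2\sqrt{\pi n}\,\exp(-\sqrt[3]{n})\right] = \tfrac{1}{2} b_0(n)\,(1 + o(1)). \tag{85}$$

This justifies the claim of the proposition when $\alpha = 1$. The general case ($\alpha \neq 1$) follows 
similarly from saddle point bounds taken at $z = \alpha$. 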
The saddle point bounds described above are obviously not tight: with some care in derivations, 
one can show by the same means that the distribution is tightly concentrated around its 
mean, itself asymptotic to $\log n / \log\log n$. In addition, the saddle point method may be used 
instead of crude bounds. These results, in the context of longest probe sequences in hashing, 
were obtained by Gonnet [242] under the Poisson model. Many key estimates regarding random 
allocations (including capacity) are to be found in the book by Kolchin et al. [316]. Analyses of 
this type are also useful in evaluating various dynamic hashing algorithms by means of saddle 
point methods [171, 408]. .................. END OF EXAMPLE VIII.12. 
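The saddle point bounds at z = 1 used in the proof can be compared with exact probabilities for a small number of balls. The sketch below is illustrative only: it assumes the standard truncated exponential $e_b(z) = \sum_{j \le b} z^j/j!$, takes n = 20 balls in n bins (an arbitrary choice), and uses hypothetical function names. The exact value is $\frac{n!}{n^n}[z^n]e_b(z)^n$; the bounds follow from the elementary inequality $[z^n]f(z) \le f(1)$, valid for series with nonnegative coefficients.

```python
from fractions import Fraction
from math import e, factorial, pi, sqrt

def truncated_exp(b):
    """Coefficients of e_b(z) = sum_{j <= b} z^j / j! (exact rationals)."""
    return [Fraction(1, factorial(j)) for j in range(b + 1)]

def prob_capacity_at_most(n, b):
    """Exact P{C_n <= b} = (n!/n^n) [z^n] e_b(z)^n for n balls in n bins."""
    power = [Fraction(1)] + [Fraction(0)] * n
    base = truncated_exp(b)
    for _ in range(n):
        nxt = [Fraction(0)] * (n + 1)
        for i, p in enumerate(power):
            if p:
                for j, c in enumerate(base):
                    if i + j <= n:
                        nxt[i + j] += p * c
        power = nxt
    return power[n] * factorial(n) / Fraction(n) ** n

def rho(n, b):
    """(e_b(1)/e)^n: probability that n Poisson(1) variables are all <= b."""
    return (float(sum(truncated_exp(b))) / e) ** n

n = 20
stirling = factorial(n) * e ** n / n ** n  # prefactor n! e^n / n^n of the bounds
for b in range(1, 8):
    p = prob_capacity_at_most(n, b)
    print(b, float(p), stirling * rho(n, b), float(1 - p), stirling * (1 - rho(n, b)))
```

The bounds are crude (they may exceed 1 in the transition region), but they capture the rapid decay of both tails, which is all the proof requires.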


VIII. 9. Multiple saddle points 


We conclude this chapter with a discussion of higher order saddle points, ac- 
companied by brief indications on what are known as phase transitions or critical 
phenomena in the applied sciences. 

Multiple saddle point formula. All the analyses carried out so far have been in 
terms of simple saddle points, which represent by far the most common situation. In 
order to get a feel of what to expect in the case of multiple saddle points, consider first 
the problem of estimating the two real integrals, 


$$I_n := \int_0^{1} (1 - x^{2})^{n}\,dx, \qquad J_n := \int_0^{1} (1 - x^{3})^{n}\,dx.$$

FIGURE VIII.11. Two views of a double saddle point also known as “monkey saddle”. 


(For the purpose of this discussion, we ignore the fact that the integrals can be evaluated 
in closed form by way of the Beta function.) The contribution of any interval 
$[x_0, 1]$ is exponentially small, and the ranges to be considered on the right of 0 are 
about $n^{-1/2}$ and $n^{-1/3}$, respectively. One thus sets 

$$x = \frac{t}{\sqrt{n}} \ \text{ for } I_n, \qquad x = \frac{t}{\sqrt[3]{n}} \ \text{ for } J_n.$$

Then local expansions apply, tails can be completed in the usual way, to the effect that 

$$I_n \sim \frac{1}{\sqrt{n}}\int_0^{\infty} e^{-t^{2}}\,dt, \qquad J_n \sim \frac{1}{\sqrt[3]{n}}\int_0^{\infty} e^{-t^{3}}\,dt.$$

The integrals reduce to the Gamma function integral, which provides 

$$I_n \sim \frac{1}{2}\,\frac{\Gamma(\frac{1}{2})}{n^{1/2}}, \qquad J_n \sim \frac{1}{3}\,\frac{\Gamma(\frac{1}{3})}{n^{1/3}}.$$

The repeated occurrences of $\frac{1}{2}$ in the quadratic case and of $\frac{1}{3}$ in the cubic case stand 
out. The situation in the cubic case corresponds to the Laplace method for integrals 
being used when a multiple critical point is present. 
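Since the two integrals have a closed form in terms of the Beta function, the estimates are easy to test. The sketch below is an illustration with hypothetical function names: the substitution $u = x^p$ gives $\int_0^1 (1 - x^p)^n\,dx = \frac{1}{p}B(\frac{1}{p}, n+1)$, which is compared with the leading term $\frac{1}{p}\Gamma(\frac{1}{p})\,n^{-1/p}$ for p = 2 and p = 3.

```python
from math import exp, gamma, lgamma

def power_integral(p, n):
    """Exact integral of (1 - x^p)^n over [0, 1], equal to (1/p) B(1/p, n+1)."""
    return exp(lgamma(1 / p) + lgamma(n + 1) - lgamma(n + 1 + 1 / p)) / p

def laplace_estimate(p, n):
    """Leading term (1/p) Gamma(1/p) n^(-1/p) given by the Laplace method."""
    return gamma(1 / p) / p * n ** (-1 / p)

for p in (2, 3):
    for n in (10, 100, 1000):
        print(p, n, power_integral(p, n) / laplace_estimate(p, n))
```

In both cases the ratio tends to 1, with a relative error of order 1/n.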

What has been just encountered in the case of real integrals is typical of what to 
expect for complex integrals and saddle points of higher orders. Consider, for simplicity, 
the case of a double saddle point of an analytic function F(z). At such a point $\zeta$, 
we have $F(\zeta) \neq 0$, $F'(\zeta) = F''(\zeta) = 0$, and $F'''(\zeta) \neq 0$. Then, there are three 
steepest descent lines emanating from the saddle point and three steepest ascent lines. 
Accordingly, one should think of the landscape of |F(z)| as formed of three “valleys” 
separated by three mountains and meeting at the common point $\zeta$. The characteristic 
aspect is that of a “monkey saddle” (comparable to a saddle with places for two 
legs and a tail) and is displayed in Figure 11. We can then state a modified form 
of the saddle point formula of Theorem VIII.3. 


Theorem VIII.8. Consider an integral $\int_A^B F(z)\,dz$, where the integrand $F = e^{f}$ is 
an analytic function depending on a large parameter and A, B lie in adjacent valleys 
across a double saddle point $\zeta$, which is a root of the saddle point equation 

$$f'(\zeta) = f''(\zeta) = 0,$$

or, equivalently, $F'(\zeta) = F''(\zeta) = 0$. Assume that the contour C connecting A to B 
can be split into $C = C^{(0)} \cup C^{(1)}$ in such a way that the following conditions are 
satisfied: (i) the tail integral $\int_{C^{(1)}}$ is negligible; (ii) in the central domain $C^{(0)}$, a 
cubic approximation holds, 

$$f(z) = f(\zeta) + \frac{1}{3!}\,f'''(\zeta)\,(z - \zeta)^{3} + O(\eta_n),$$

with $\eta_n \to 0$ as $n \to \infty$ uniformly; (iii) tails can be completed back. Then one has 

$$\frac{1}{2i\pi}\int_A^B e^{f(z)}\,dz \sim \pm\,\frac{\omega}{2i\pi}\,\Gamma\!\left(\tfrac{1}{3}\right)\frac{e^{f(\zeta)}}{\sqrt[3]{|f'''(\zeta)|}}, \tag{86}$$

where $\omega$ is a cube root of unity ($\omega^{3} = 1$) dependent upon the position of the valleys 
of A and B, while the sign ± depends on orientation. 


PROOF. The proof is a simple adaptation of that of Theorem VIII.3. The heart of the 
matter is now the integration of 

$$\int_C \exp\left(\frac{1}{3!}\,f'''(\zeta)\,(z - \zeta)^{3}\right) dz,$$

with C composed of the two rays $te^{2ij\pi/3}$ and $te^{2i(j+1)\pi/3}$, for $t \in \mathbb{R}_{\geq 0}$. 
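▷ VIII.39. Higher-order saddle points. For a saddle point of order $p - 1$, the saddle point 
formula reads 

$$\frac{1}{2i\pi}\int_A^B e^{f(z)}\,dz \sim \pm\,\frac{\omega}{2i\pi}\,\Gamma\!\left(\tfrac{1}{p}\right)\frac{e^{f(\zeta)}}{\sqrt[p]{|f^{(p)}(\zeta)|}},$$

where $\omega^{p} = 1$. ◁ 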


▷ VIII.40. Vanishing multipliers and multiple saddle points. This note supplements Note 39. 
For a saddle point of order $p - 1$ and an integrand of the form $(z - \zeta)^{b} \cdot e^{f(z)}$, the saddle point 
formula must be modified according to 

$$\int_0^{\infty} x^{b}\,e^{-x^{p}/p!}\,dx = \frac{(p!)^{(b+1)/p}}{p}\,\Gamma\!\left(\frac{b+1}{p}\right).$$

Thus, the argument of the Γ factor is changed from $\frac{1}{p}$ to $\frac{b+1}{p}$ as is the exponent of $|f^{(p)}(\zeta)|$ 
and of n in the case of large power estimates. ◁ 

Forests and coalescence of saddle points. We give below an application to the 
counting of forests of unrooted trees made of a large number of trees. The analysis 
precisely involves a double saddle point in a certain critical region. The problem is 
in particular relevant to the analysis of random graphs during the phase where a giant 


component has not yet emerged. 


EXAMPLE VIII.13. Forests of unrooted trees. The problem here consists in determining the 
number $F_{m,n}$ of ordered forests, i.e., sequences, made of m (labelled, nonplane) unrooted trees 
and comprised of n nodes in total. The number of unrooted trees of size n is, by virtue of 
Cayley’s formula, $n^{n-2}$ and its EGF is expressed as $U = T - T^{2}/2$, where T is the Cayley 
tree function satisfying $T = ze^{T}$. Consequently, we have 

$$\frac{1}{n!}F_{m,n} = [z^{n}]\left(T(z) - \frac{T(z)^{2}}{2}\right)^{m} = \frac{1}{2i\pi}\oint\left(T - \frac{T^{2}}{2}\right)^{m}\frac{dz}{z^{n+1}}.$$

FIGURE VIII.12. The function H governing the exponential rate of the number of 
forests exhibits a “phase transition” at $\alpha = \frac{1}{2}$ (left); the quantity $\frac{1}{n}\log(F_{m,n}/n!)$ as a 
function of $\alpha = m/n$ for n = 200 (right). 

Like in the analytic proof of the Lagrange Inversion Theorem it proves convenient to adopt 
t = T as an independent variable, so that $z = te^{-t}$ becomes a dependent variable. Since 
$dz = (1 - t)e^{-t}\,dt$, this provides the integral representation: 

$$\frac{1}{n!}F_{m,n} = \frac{1}{2i\pi}\oint\left(t - \frac{t^{2}}{2}\right)^{m} e^{nt}\,(1 - t)\,\frac{dt}{t^{n+1}}.$$
The case of interest here is when m and n are linearly related. We thus set $m = \alpha n$, where a 
priori $\alpha \in (0, 1)$. Then, the integral representation of $F_{m,n}$ becomes 

$$\frac{1}{n!}F_{m,n} = \frac{1}{2i\pi}\oint_C e^{n h_\alpha(t)}\,(1 - t)\,\frac{dt}{t}, \qquad h_\alpha(t) := \alpha\log\left(1 - \frac{t}{2}\right) + t + (\alpha - 1)\log t, \tag{87}$$

where C encircles 0. This has the form of a “large power” integral. Saddle points are found as 
usual as zeros of the derivative $h'_\alpha$; there are two of them given by 

$$\zeta_0 = 2 - 2\alpha, \qquad \zeta_1 = 1.$$

For $\alpha < \frac{1}{2}$ one has $\zeta_0 > \zeta_1$ while for $\alpha > \frac{1}{2}$ the inequality is reversed and $\zeta_0 < \zeta_1$. In both 
cases, a simple saddle point analysis succeeds, based on the saddle point nearer to the origin; 
see Note 41 below. In contrast, when $\alpha = \frac{1}{2}$, the points $\zeta_0$ and $\zeta_1$ coalesce to the common 
value 1. In this last case, we have $h'_\alpha(1) = h''_\alpha(1) = 0$ while $h'''_\alpha(1) = -2$ is nonzero: there is 
a double saddle point at 1. 

The number of forests thus presents two different regimes depending on whether $\alpha < \frac{1}{2}$ 
or $\alpha > \frac{1}{2}$, and there is a discontinuity of the analytic form of the estimates at $\alpha = \frac{1}{2}$. The 
situation is reminiscent of “critical phenomena” and phase transitions (e.g., from solid to liquid 
to gas) in physics, where such discontinuities are encountered. This provides a good motivation 
to study what happens right at the “critical” value $\alpha = \frac{1}{2}$. 

FIGURE VIII.13. A plot of $e^{h}$ with the double saddle point at 1 (left). The 
level lines of $e^{h}$ with valleys (the region higher than $e^{h(1)}$ is darkened) and a legal 
integration contour (right). 


We thus consider the special value $\alpha = \frac{1}{2}$ and set $h \equiv h_{1/2}$. What is to be determined is 
therefore the number of forests of total size n that are made of n/2 trees, assuming naturally n 
even. Bearing in mind that the double saddle point is at $\zeta = \zeta_0 = \zeta_1 = 1$, one has 

$$h(z) = 1 - \frac{1}{2}\log 2 - \frac{1}{3}(z - 1)^{3} + O\left((z - 1)^{4}\right) \qquad (z \to 1).$$

Thus, upon neglecting the tails and localizing the integral to a disc centred at 1 with radius $\delta \equiv 
\delta(n)$ such that 

$$n\delta^{3} \to \infty, \qquad n\delta^{4} \to 0$$

($\delta = n^{-3/10}$ is suitable), we have the asymptotic equivalence (with y representing z − 1) 

$$\frac{1}{n!}F_{n/2,n} \sim -\frac{e^{n(1 - \frac{1}{2}\log 2)}}{2i\pi}\int_D e^{-ny^{3}/3}\,y\,dy + \text{exponentially small}, \tag{88}$$

where D is a certain (small) contour containing 0 obtained by transformation from C. 

The discussion so far has left aside the choice of the contour C in (87), hence of the 
geometric aspect of D near 0, which is needed in order to fully specify (88). Because of the 
minus sign in the third derivative, $h'''(1) = -2$, the three steepest descent half lines stemming 
from 1 have angles 0, $2\pi/3$, $-2\pi/3$. This suggests adopting as original contour C in (87) two 
symmetric segments stemming from 1 connected by a loop left of 0; see Figure 13. Elementary 
calculations justify that the contour can be suitably dimensioned so as to remain always below 
level h(1). See also the right drawing of Figure 13 where the level curves of the valleys below 
the saddle point are drawn together with a legal contour of integration that winds about 0. 

Once the original contour of integration has been fixed, the orientation of D in (88) is fully 
determined. After effecting the further change of variables $y = wn^{-1/3}$ and completing the 
tails, we find 

$$\frac{1}{n!}F_{n/2,n} \sim -\lambda\,\frac{e^{n(1 - \frac{1}{2}\log 2)}}{n^{2/3}}, \qquad \lambda := \frac{1}{2i\pi}\int_E e^{-w^{3}/3}\,w\,dw, \tag{89}$$

where E connects $\infty e^{-2i\pi/3}$ to 0 then to $\infty e^{2i\pi/3}$. The evaluation of the integral giving $\lambda$ is 
now straightforward (in terms of the Gamma function), which gives: 

Proposition VIII.11. The number of forests of total size n comprised of n/2 unrooted Cayley 
trees satisfies 

$$\frac{1}{n!}F_{n/2,n} \sim \frac{3^{-1/3}}{\Gamma(1/3)}\,e^{n}\,2^{-n/2}\,n^{-2/3}.$$

The number three is characteristically ubiquitous in the formula. (Furthermore, the formula 
displays the exponent $\frac{2}{3}$ instead of $\frac{1}{3}$ in the general case (88) because of the additional factor 
(1 − z) present in the integral representation (87), which vanishes at the saddle point 1; see 
Note 40.) .................. END OF EXAMPLE VIII.13. 
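The critical estimate for forests can be confronted with exact values. The sketch below is an illustration resting on stated assumptions: the function name is hypothetical, n = 120 is an arbitrary even size, and the constant $3^{-1/3}/\Gamma(\frac{1}{3})$ is the value obtained here by evaluating the integral $\lambda$ of (89) along the two rays of E in terms of the Gamma function. Exact coefficients of $U(z)^m$ are computed by writing $U(z) = z\,u(z)$ with $u(0) = 1$ and applying the classical recurrence for powers of a power series, $k P_k = \sum_{j=1}^{k} (mj - (k - j))\,u_j P_{k-j}$.

```python
from fractions import Fraction
from math import exp, factorial, gamma, log

def forests_over_factorial(n, m):
    """Exact F_{m,n}/n! = [z^n] U(z)^m, where U(z) = sum_s s^(s-2) z^s / s!."""
    d = n - m
    # u_k = [z^(k+1)] U(z) = (k+1)^(k-1) / (k+1)!, so that U(z) = z * u(z), u_0 = 1.
    u = [Fraction(k + 1) ** (k - 1) / factorial(k + 1) for k in range(d + 1)]
    # Power-series recurrence for P = u^m (all terms are exact rationals).
    P = [Fraction(1)] + [Fraction(0)] * d
    for k in range(1, d + 1):
        P[k] = sum((m * j - (k - j)) * u[j] * P[k - j] for j in range(1, k + 1)) / k
    return P[d]

n = 120  # illustrative even size
exact = forests_over_factorial(n, n // 2)
# Assumed constant: 3^(-1/3) / Gamma(1/3), from the Gamma-function evaluation of (89).
estimate = 3 ** (-1 / 3) / gamma(1 / 3) * exp(n * (1 - log(2) / 2)) * n ** (-2 / 3)
ratio = float(exact) / estimate
print(ratio)
```

The ratio is close to 1 and improves slowly as n increases, in line with the fractional powers of n that govern the error terms at a double saddle point.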


The problem of analysing random forests composed of a large number of trees was 
first addressed by the Russian School, most notably Kolchin and Britikov. We 
refer the reader to Kolchin’s book [315, Ch. I] where nearly thirty pages are devoted 
to a deeper study of the number of forests and of associated parameters. Kolchin’s 
approach is however based on an alternative presentation in terms of sums of independent 
random variables and stable laws of index 3/2. As it turns out there is a striking 
parallel with the analysis of the growth of the random graph in the critical region, 
when the random graph stops resembling a large collection of disconnected tree components. 

An almost sure sign of (hidden or explicit) monkey saddles is the occurrence of 
$\Gamma(\frac{1}{3})$ factors in the final formulae and cube roots in exponents of powers of n. It is in 
fact possible to go much further than we have done here with the analysis of forests 
(where we have stayed right at the critical point) and provide asymptotic expressions 
that describe the transition between regimes, here from $A^{n}n^{-3/2}$, to $B^{n}n^{-2/3}$, then 
to $C^{n}n^{-1/2}$. The analysis then appeals to the theory of coalescent saddle points well 
developed by applied mathematicians (see, e.g., the exposition in [59, 381, 502]) and 
the already evoked rôle of the Airy function. We do not pursue this thread further since 
it properly belongs to multivariate asymptotics. It is exposed in a detailed manner in 
an article of Banderier, Flajolet, Schaeffer, and Soria [22] relative to the size of the 
core in a random map, on which our presentation of forests has been modelled. 

The results of several studies conducted towards the end of the previous millennium 
do suggest that, amongst threshold phenomena and phase changes, there is a fair 
amount of universality in descriptions of combinatorial and probabilistic problems by 
means of multiple and coalescing saddle points. In particular $\Gamma(\frac{1}{3})$ factors and the 
Airy function surface recurrently in the works of Flajolet, Janson, Knuth, Łuczak and 
Pittel [193, 282], which are relative to the Erdős–Rényi random graph model in its 
critical phase; see also [205] for a partial explanation. The occurrence of the Airy area 
distribution (in the context of certain polygon models related to random walks) can be 
related to this orbit of techniques, as first shown by Prellberg [401], and strong numerical 
evidence evoked in Chapter V (p. 349) suggests that this might extend to the 
difficult problem of self-avoiding walks [411]. Airy-related distributions also appear 
in problems relative to the satisfiability of random boolean expressions [61], the path 
length of trees [461, 459, 460], as well as cost functionals of random allocations [200]. 
The reasons are sometimes well understood in separate contexts by probabilists, statistical 
physicists, combinatorialists, and analysts, but a global framework is still missing. 
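▷ VIII.41. Forests and simple saddle points. When $0 < \alpha < \frac{1}{2}$, the number of forests 
satisfies, for some computable $C_-(\alpha)$, 

$$\frac{1}{n!}F_{m,n} \sim C_-(\alpha)\,\frac{e^{nH_-(\alpha)}}{n^{3/2}}, \qquad H_-(\alpha) = 1 - \alpha\log 2.$$

When $\frac{1}{2} < \alpha < 1$, the number of forests satisfies, for some computable $C_+(\alpha)$, 

$$\frac{1}{n!}F_{m,n} \sim C_+(\alpha)\,\frac{e^{nH_+(\alpha)}}{n^{1/2}}, \qquad H_+(\alpha) = \alpha\log\alpha + 2 - 2\alpha + (\alpha - 1)\log(2 - 2\alpha).$$

This results from a routine simple saddle point analysis at $\zeta_1$ and $\zeta_0$ respectively. ◁ 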


VIII. 10. Perspective 


One of the pillars of classical analysis, the saddle-point method plays a major 
role in analytic combinatorics. It provides an approach to coefficient asymptotics and 
can handle combinatorial classes that are not amenable to singularity analysis. The 
simplest case is that of urns, whose generating function $e^{z}$ has no singularities at a 
finite distance. Similar functions commonly arise as composed SET constructions. 
Broadly speaking, for the class of generating functions that arise from the combinato- 
rial constructions of Part A of this book, singularity analysis is effective for functions 
that have moderate growth at their singularities; the saddle point method is effective 
otherwise. 

The essential idea behind the saddle point method is simple, and it is very easy 
to get good bounds on coefficient growth. In effect, the Cauchy coefficient integral 
defines a surface with a well-defined saddle point somewhere along the positive real 
axis, and choosing a circle centred at the origin and passing through the saddle point 
already provides useful bounds by elementary arguments. The essence of the full 
saddle point method is the development of more precise bounds, which are obtained 
by splitting the contour into two parts and balancing the associated errors. 
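
As a minimal numerical sketch of the elementary bound just described (an illustration, not taken from the text), consider the urn generating function $e^z$. For any $r > 0$, the Cauchy integral over the circle $|z| = r$ gives $[z^n] e^z \le e^r / r^n$, and the right-hand side is smallest at the saddle point $r = n$. The function names below are illustrative only.

```python
import math

def saddle_point_bound(n: int) -> float:
    """Upper bound e^r / r^n for [z^n] e^z, evaluated at the saddle point r = n."""
    r = float(n)
    return math.exp(r - n * math.log(r))

def exact_coefficient(n: int) -> float:
    """Exact value [z^n] e^z = 1/n!."""
    return 1.0 / math.factorial(n)

for n in (5, 20, 80):
    bound, exact = saddle_point_bound(n), exact_coefficient(n)
    # By Stirling's formula, bound/exact behaves like sqrt(2*pi*n):
    # the crude bound is off only by a factor of order sqrt(n).
    print(n, bound / exact, math.sqrt(2 * math.pi * n))
```

The gap of order $\sqrt{n}$ between the bound and the exact value is what the full method recovers by splitting the contour and estimating the central part precisely.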

Combinatorial classes that are amenable to saddle-point analysis have so far only 
been incorporated into relatively few schemas, compared to what we saw for singu- 
larity analysis. The consistency of the approach certainly argues for the existence of 
many more such schemas. A positive signal in that direction is the fact that several 
researchers have developed concepts of admissibility that serve to delineate classes of 
functions for which the saddle point method boils down to verifying simple conditions. 

The saddle point method also provides insights in more general contexts. Most 
notably, the general results on analysis of large powers lay the groundwork for distri- 
butional analyses and limit laws, which are the subject of the next chapter. 


Saddle point methods take their sources in applied mathematics, one of them being the 
asymptotic analysis by Debye (1909) of Bessel functions of large order. Saddle point analy- 
sis is sometimes called steepest descent analysis, especially when integration contours strictly 
coincide with steepest descent paths. Saddle points themselves are also called critical points 
(i.e., points where a first derivative vanishes). Because of its roots in applied mathematics, the 
method is well covered by the literature in this area, and we refer to the books by Olver [381], 
Henrici [265], or Wong [502] for extensive discussions. A vivid introduction to the subject is to 
be found in De Bruijn’s book [111]. We also recommend Odlyzko’s impressive survey [376]. 

To a large extent, saddle point methods have made an irruption in combinatorial enumera- 
tions in the 1950’s. Early combinatorial papers were concerned with permutations (involutions) 
or set partitions: this includes works by Moser and Wyman [367, 368, 369] that are mostly 
directed towards entire functions. 

Hayman’s approach [263] which we have exposed here (see also [502]) is notable in its 
generality as it envisions saddle point analysis in an abstract perspective, which makes it possi- 
ble to develop general closure theorems. A similar thread was followed by Harris and Schoen- 
feld who gave stronger conditions then allowing full asymptotic expansions [261]; Odlyzko 
and Richmond [378] were successful in connecting these conditions with Hayman admissibil- 
ity. Another valuable work is Wyman’s extension to nonpositive functions [511]. 

Interestingly enough, developments that parallel the ones in analytic combinatorics have 
taken place in other regions of mathematics. Erwin Schrödinger introduced saddle point methods 
in his lectures [431] at Dublin in 1944 in order to provide a rigorous foundation to some 
models of statistical physics that closely resemble balls-in-bins models. Daniels’ publica- 
tion [105] of 1954 is a historical source for saddle point techniques in probability and statistics, 
where refined versions of the central limit theorem can be obtained. (See for instance the de- 
scription in Greene and Knuth’s book [250].) Since then, the saddle point method has proved a 
useful tool for deriving Gaussian limiting distributions. We have given here some idea of this 
approach which is to be developed further in a later chapter, where we shall discuss some of 
Canfield’s results [76]. Analytic number theory also makes a heavy use of saddle point analysis. 
In additive number theory, the works by Hardy, Littlewood, and Ramanujan relative to integer 
partitions have been especially influential, see for instance Andrews’ book [10] and Hardy’s 
Lectures on Ramanujan [260] for a fascinating perspective. In multiplicative number theory, 
generating functions take the form of Dirichlet series while Perron’s formula replaces Cauchy’s 
formula. For saddle point methods in this context, we refer to Tenenbaum’s book [468] and his 
seminar survey [467]. 

A more global perspective on limit probability distributions and saddle point techniques 
will be given in the next chapter as there are strong relations to the quasi-powers framework de- 
veloped there, to local limit laws, and to large deviation estimates. General references for some 
of these aspects of the saddle point method are the articles of Bender-Richmond [27], Can- 
field [76], and Gardy [227, 228, 229]. Regarding multiple saddle points and phase transitions, 
we refer the reader to references provided at the end of Section VIII. 9. 
Multivariate Asymptotics and Limit 
Distributions 


Un problème relatif aux jeux du hasard, 
proposé à un austère janséniste par un homme du monde, 
a été à l'origine du Calcul des Probabilités¹. 


— SIMEON-DENIS POISSON 
Analytic combinatorics² deals with the enumeration of combinatorial structures 
in relation to algebraic and analytic properties of generating functions. The most ba- 
sic cases are the enumeration of combinatorial classes and the analysis of moments of 
combinatorial parameters. These involve generating functions in one (formal or com- 
plex) variable as discussed extensively in previous chapters. They are consequently 
essentially univariate problems. 

Many applications, in combinatorics as well as in the applied sciences, require 
quantifying the behaviour of parameters of combinatorial structures. It is typically 
useful to know that a random permutation of size n has a number of runs whose 
average (mean) equals (n + 1)/2, but it may be equally important to know to which 


¹ “A problem relative to games of chance proposed to an austere Jansenist by a man of the world has 
been at the origin of the calculus of probabilities.” Poisson refers here to the fact that questions of betting 
and gambling posed by the Chevalier de Méré (who was both a gambler and a philosopher) led Pascal (an 
austere religious man) to develop some of the first foundations of probability theory. 

² Warning: This chapter is still in a very preliminary state (November 2004). It is only included at 
this stage in order to illustrate the global architecture of Analytic Combinatorics. 
extent such an average is representative of what occurs in simulations or on actual 
data that obey the randomness model. As a matter of fact, in a random permutation of 
size n = 1,000, it is found that there is about a 70% chance that the number of runs 
lies in the interval 490 .. 510. Even more dramatically, for runs and a permutation of 
size n = 1,000 still, there is probability less than $10^{-6}$ of observing a case that deviates 
by more than 10% from the mean value; this probability decreases to about $10^{-65}$ for 
n = 10,000, and is even less than $10^{-653}$ for n = 100,000. As illustrated by such 
numeric data, there is obvious interest in analysing the “central” region near the mean, 
as well as in quantifying the risk of finding instances that deviate appreciably from the 
expected value. These are now typically bivariate problems. 
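
The figures quoted above can be explored numerically. The following is a minimal sketch, assuming that "runs" means maximal ascending runs, so that a permutation with $k$ runs has $k - 1$ descents and is counted by an Eulerian number. The recurrence used is the standard one for Eulerian numbers, which is not stated in this chapter, and the function name is illustrative.

```python
def run_distribution(n: int) -> list[float]:
    """Return p with p[k] = probability that a uniform random permutation
    of size n has exactly k ascending runs (k = 1..n; p[0] = 0)."""
    # probs[m] = probability of m descents; a permutation with m descents has m + 1 runs.
    probs = [1.0]  # size 1: no descent
    for size in range(2, n + 1):
        new = [0.0] * size
        for m in range(size):
            # Eulerian recurrence A(s, m) = (m + 1) A(s-1, m) + (s - m) A(s-1, m-1),
            # divided by s to work with probabilities instead of counts.
            stay = probs[m] * (m + 1) if m < size - 1 else 0.0
            up = probs[m - 1] * (size - m) if m >= 1 else 0.0
            new[m] = (stay + up) / size
        probs = new
    return [0.0] + probs  # shift so that the index is the number of runs

p = run_distribution(1000)
mean = sum(k * pk for k, pk in enumerate(p))
central = sum(p[490:511])  # probability that the number of runs is in 490 .. 510
print(mean, central)
```

Floating-point arithmetic is adequate for the central region; the extreme tail probabilities discussed above are far below double precision and would require exact or logarithmic arithmetic.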

It is frequently observed that the histograms of the distribution of a combinatorial 
parameter (for varying size values) exhibit a common characteristic “shape”. In this 
case, we say that there exists a limit law, which may be of the discrete or the continu- 
ous type. Our aim here is to detect such limit laws, and a few examples have already 
appeared scattered in this book, in the case where they can be reduced to a collection 
of univariate analyses. This chapter provides a coherent set of analytic techniques 
dedicated to extracting coefficients of bivariate analytic functions. The mathematics 
combine methods of complex asymptotic analysis as previously exposed with a small 
selection of fundamental theorems from the analytic side of classical probability the- 
ory. 

In simpler cases, limit laws are discrete and, when this happens, they often belong 
to the geometric or Poisson type. In many other cases, limit laws are continuous, a 
prime example being the Gaussian law associated with the famous bell-shaped curve, 
which surfaces so frequently in elementary combinatorial structures. The goal of this 
chapter is to offer a fundamental analytic framework for extracting limit laws from 
combinatorics. 

Symbolic methods provide bivariate generating functions for many natural pa- 
rameters of combinatorial structures. Analytically, the auxiliary variable marking the 
combinatorial parameter under study then induces a deformation of the (univariate) 
counting generating function. This deformation may affect the type of singularity that 
the counting generating function presents in various ways. A perturbation of univari- 
ate singularity analysis is then often sufficient to derive an asymptotic estimate of the 
probability generating function of a given parameter, when taken over objects of some 
large size. Continuity theorems from probability theory finally allow us to conclude 
on the existence of a limit law. 

An especially important component of this paradigm is the framework of “Quasi- 
Powers”. Large powers tend to occur for coefficients of generating functions (think 
of quantities of the form $\sim \rho^{-n}$ that arise from radius of convergence bounds). The 
collection of deformations of a single counting generating function is then likely to 
induce for the corresponding coefficients a collection of approximations that involve 
large powers together with small error terms—these are referred to as quasi-powers. 
From there, a Gaussian law is derived along lines that are somewhat reminiscent of 
the classical central limit theorem of probability theory (expressing the asymptotically 
Gaussian character of sums of independent random variables). 
The direct relation that can be established between combinatorial specifications 
and asymptotic properties, in the form of limit laws, is especially striking, and it is 
a characteristic feature of analytic combinatorics. In fact, almost any classical law 
of probability theory and statistics is likely to occur somewhere in analytic combina- 
torics. Conversely, almost any simple combinatorial parameter is likely to be governed 
by an asymptotic law. 


IX. 1. Limit laws and combinatorial structures 


What is given is a combinatorial class $\mathcal{C}$, labelled or unlabelled, and an integer-valued 
combinatorial parameter $\chi$. There results both a family of probabilistic models, 
namely for each $n$ the uniform distribution over $\mathcal{C}_n$ that assigns to any $\gamma \in \mathcal{C}_n$ the 
probability 

\[ \mathbb{P}(\gamma) = \frac{1}{C_n}, \qquad \text{with } C_n = \operatorname{card}(\mathcal{C}_n), \]

and a corresponding family of random variables obtained by restricting $\chi$ to $\mathcal{C}_n$. Under 
the uniform distribution over $\mathcal{C}_n$, we then have 

\[ \mathbb{P}_{\mathcal{C}_n}(\chi = k) = \frac{1}{C_n} \operatorname{card}\{\gamma \in \mathcal{C}_n \mid \chi(\gamma) = k\}. \]

We write $\mathbb{P}_{\mathcal{C}_n}$ to indicate the probabilistic model relative to $\mathcal{C}_n$, but also freely abbreviate 
it to $\mathbb{P}_n$ or write $\mathbb{P}(\chi_n)$ whenever $\mathcal{C}$ is clear in context. 

As $n$ increases, the histograms of the distributions of $\chi$ often share a common 
profile; see Figure 1 for two characteristic examples that we discuss next. Our purpose 
is to relate such phenomena to the analysis of bivariate generating functions provided 
by the symbolic method. 


Binary words. Let us start by discussing the case of binary words with two simple 
parameters, one leading to a discrete law, the other to a continuous limit. The ex- 
ample is purposely chosen simple enough that explicit expressions are available for 
the probability distributions at stake. Nonetheless, it is typical of the approach taken 
in this chapter, and, once equipped with suitably general theorems, it is hardly more 
difficult to discuss the number of leaves in a nonplane unlabelled tree or the number 
of summands in a composition into prime summands. 


Take the class $\mathcal{W}_n$ of binary words of length $n$ over the alphabet $\{a, b\}$ and consider 
the two parameters for $w \in \mathcal{W}$: 

\[ \chi(w) := \text{number of initial $a$'s in } w, \qquad \xi(w) := \text{total number of $a$'s in } w. \]

Explicit expressions are available for the counts and 

\[ \mathbb{P}_{\mathcal{W}_n}(\chi = k) = \frac{1}{2^{k+1}} [\![ 0 \le k < n ]\!] + \frac{1}{2^n} [\![ k = n ]\!], \qquad \mathbb{P}_{\mathcal{W}_n}(\xi = k) = \frac{1}{2^n} \binom{n}{k}. \]

The probabilities relative to $\chi$ then resemble, in the asymptotic limit of large $n$, the 


FIGURE IX.1. Histograms of probability distributions for the number of initial 
$a$'s in a random binary string (parameter $\chi$, left) and the total number of $a$'s 
(parameter $\xi$, right). The histogram corresponding to $\chi$ is not normalized and 
direct convergence to a discrete geometric law is apparent; for $\xi$, the horizontal 
axis is scaled to $n$, and the histograms quickly conform to the bell-shaped curve 
that is characteristic of a continuous Gaussian limit. 


geometric distribution. One has, for each $k$, 

\[ \lim_{n \to \infty} \mathbb{P}_{\mathcal{W}_n}(\chi = k) = \frac{1}{2^{k+1}}. \]

(In this simple case, it is even true that the limit is exactly attained as soon as $n > k$.) 
We say that there is a limit law of the discrete type for $\chi$, this limit law being 
geometric. 
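
The two explicit laws can be checked by brute force for small $n$. The sketch below enumerates all binary words of a given length and compares the exact distributions of the number of initial $a$'s and of the total number of $a$'s with the closed forms stated above; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

def enumerated_laws(n: int):
    """Enumerate all binary words of length n; return the exact laws of chi and xi."""
    chi = [Fraction(0)] * (n + 1)   # chi = number of initial a's
    xi = [Fraction(0)] * (n + 1)    # xi = total number of a's
    weight = Fraction(1, 2 ** n)    # uniform probability of each word
    for word in product("ab", repeat=n):
        # Position of the first b, or n if the word consists of a's only.
        initial = next((i for i, c in enumerate(word) if c == "b"), n)
        chi[initial] += weight
        xi[word.count("a")] += weight
    return chi, xi

def chi_formula(n: int, k: int) -> Fraction:
    """1/2^(k+1) for 0 <= k < n, and 1/2^n for k = n."""
    return Fraction(1, 2 ** (k + 1)) if k < n else Fraction(1, 2 ** n)

def xi_formula(n: int, k: int) -> Fraction:
    """Binomial law: binom(n, k) / 2^n."""
    return Fraction(comb(n, k), 2 ** n)

n = 10
chi, xi = enumerated_laws(n)
print(all(chi[k] == chi_formula(n, k) for k in range(n + 1)))
print(all(xi[k] == xi_formula(n, k) for k in range(n + 1)))
```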

In contrast, the parameter $\xi$ has mean $\mu_n := n/2$ and standard deviation $\sigma_n := 
\frac12 \sqrt{n}$. One should then centre and scale the parameter $\xi$, introducing (over $\mathcal{W}_n$) the 
“standardized” (or “normalized”) random variable 

\[ X_n^\star := \frac{\xi - n/2}{\frac12 \sqrt{n}}, \]

which can be considered to lie in a fixed scale. It then becomes possible to examine 
the behaviour of the (cumulative) distribution function $\mathbb{P}(X_n^\star \le y)$ for some fixed $y$. 
In terms of $\xi$ itself, this means that we are considering $\mathbb{P}(\xi \le \mu_n + y \sigma_n)$ for real 
values of $y$. Then, the classical approximation of the binomial coefficients yields the 
approximation: 

\[ \lim_{n \to \infty} \mathbb{P}(\xi \le \mu_n + y \sigma_n) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2}\, dt, \tag{1} \]

which can be derived by summation from the “local” approximation 

\[ \frac{1}{2^n} \binom{n}{\frac12 n + \frac12 y \sqrt{n}} \sim \frac{e^{-y^2/2}}{\sqrt{\pi n / 2}}. \tag{2} \]

We now say that there is a limit law of the continuous type for $\xi$, this limit law being 
Gaussian. 
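
Both approximations are easy to test numerically. The following sketch compares the exact binomial distribution function with the Gaussian integral of (1), and the exact point probability with the local estimate (2). The function names and the choice $n = 2000$ are illustrative.

```python
from math import comb, erf, exp, floor, pi, sqrt

def standardized_cdf(n: int, y: float) -> float:
    """P(xi <= mu_n + y * sigma_n), where xi counts the a's in a random word of length n."""
    mu, sigma = n / 2, sqrt(n) / 2
    cutoff = min(n, floor(mu + y * sigma))
    if cutoff < 0:
        return 0.0
    # Exact integer sum, followed by one correctly rounded division.
    return sum(comb(n, k) for k in range(cutoff + 1)) / 2 ** n

def gaussian_cdf(y: float) -> float:
    """Right-hand side of (1), expressed through the error function."""
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def local_ratio(n: int, j: int) -> float:
    """Ratio of the two sides of (2) at k = n/2 + j, that is, y = 2j / sqrt(n) (n even)."""
    y = 2 * j / sqrt(n)
    exact = comb(n, n // 2 + j) / 2 ** n
    return exact / (exp(-y * y / 2) / sqrt(pi * n / 2))

n = 2000
for y in (-1.0, 0.0, 1.0):
    print(y, standardized_cdf(n, y), gaussian_cdf(y))
print(local_ratio(n, 20))
```

The distribution functions differ by a quantity of order $n^{-1/2}$ (the discrete law has atoms of that size), while the local ratio is much closer to 1.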
Though cases mixing the discrete and the continuous are theoretically conceivable 
(a rare instance arises in the theory of map enumerations and “cores”, see [22]), the 
discrete—continuous dichotomy applies to most combinatorial cases of interest. 


Distributional properties. As illustrated by the previous discussion, there are two 
major types of convergence that the discrete distribution of a combinatorial parameter 
may satisfy: 


Discrete → Discrete     and     Discrete → Continuous. 


In accordance with the general notion of convergence in distribution (or weak con- 
vergence, see APPENDIX C: Convergence in law, p. 722), we shall say that a limit 
law exists for a parameter if there is convergence of the corresponding family of (cu- 
mulative) distribution functions. In the broad context of convergence of probability 
laws, one also speaks of a central limit law when such a convergence holds. In the 
discrete-to-discrete case, convergence is established without standardizing the random 
variables involved. In the discrete-to-continuous case, the parameter should be centred 
at its mean and scaled by its standard deviation, as in (1). 

There is also interest in obtaining a local limit law, which, when available, quantifies 
individual probabilities and probability densities, as in (2). The distinction 
between local and central limits is immaterial in the discrete-to-discrete case, where 
the existence of one type of law implies the other. In the discrete-to-continuous case, 
it is technically more demanding to derive a local limit law than a central one, as 
stronger analytic properties are required. 

The speed of convergence in a limit law describes the way the finite combinatorial 
distributions approach their asymptotic limit. It provides useful information on the 
quality of asymptotic approximations for models of finite size $n$. 

Finally, quantifying the “risk” of extreme configurations necessitates estimates on 
the tails of the distributions, that is, the behaviour of the probability distribution far 
away from its mean. Such estimates are also called large deviation estimates. Large 
deviation theory constitutes a useful complement to the study of central and local 
limits, as exemplified by the discussion of runs in the introduction to this chapter. 


In the remainder of this chapter, we shall first examine the situation of discrete 
limits. After this, several sections will be dedicated to the case of continuous 
limits, with special emphasis on limit laws of the Gaussian type. In each of the two 
cases, the discussion of central laws starts with a continuity theorem, which states 
conditions under which convergence in law can be established from convergence of 
transforms. (The transforms in question are probability generating functions for the 
discrete case, characteristic functions or Laplace transforms otherwise.) Refinements, 
known as the Berry–Esseen inequalities when the limit law is continuous, then relate 
the speed of convergence of the combinatorial distributions to their limit on the one 
hand, and a distance between transforms on the other hand. Put otherwise, distributions 
are close if their transforms are close. Large deviation estimates are often obtained 
by a technique of “shifting the mean”, which is familiar in probability and statistics. 
The last section gives brief indications on the occurrence of non-Gaussian laws in the 
discrete-to-continuous scenario. 
Limit laws and bivariate generating functions. In this chapter, the starting point of 
a distributional analysis is invariably a bivariate generating function 

\[ F(z, u) = \sum_{n,k} f_{n,k}\, u^k z^n, \]

where $f_{n,k}$ represents (up to a possible normalization factor) the number of structures 
of size $n$ in some class $\mathcal{F}$ for which the parameter under study takes the value $k$. What 
is sought is asymptotic information relative to the array of coefficients 

\[ f_{n,k} = [z^n u^k] F(z, u), \]

which could in principle be approached by an iterated use of Cauchy's coefficient 
formula, 

\[ [z^n u^k] F(z, u) = \left( \frac{1}{2i\pi} \right)^2 \int_\gamma \int_{\gamma'} F(z, u)\, \frac{dz}{z^{n+1}}\, \frac{du}{u^{k+1}}. \]

Thus, a double coefficient extraction is to be effected. It turns out that it is in general 
arduous if not unfeasible to approach a bivariate counting problem in this way, so that 
another route is explored throughout this chapter³. 

First, observe that the specialization at $u = 1$ of $F(z, u)$ gives the counting generating 
function of $\mathcal{F}$, that is, $F(z) = F(z, 1)$. Next, as seen repeatedly starting 
from Chapter III, the moments of the combinatorial distribution $\{f_{n,k}\}$ for fixed $n$ 
and varying $k$ are attainable through the partial derivatives at $u = 1$, namely 

\[ \text{first moment} \;\leftrightarrow\; \left. \frac{\partial}{\partial u} F(z, u) \right|_{u=1}, \qquad \text{second moment} \;\leftrightarrow\; \left. \frac{\partial^2}{\partial u^2} F(z, u) \right|_{u=1}. \]

In summary: Counting is provided by the bivariate generating function $F(z, u)$ taken 
at $u = 1$; moments result from the bivariate generating function taken in an infinitesimal 
neighbourhood of $u = 1$. 
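
The correspondence between derivatives at $u = 1$ and moments can be made concrete on a coefficient list. In the sketch below (names are illustrative), the first derivative of $\sum_k f_{n,k} u^k$ at $u = 1$ gives $\sum_k k f_{n,k}$, and the second derivative gives $\sum_k k(k-1) f_{n,k}$, that is, the second factorial moment up to normalization; the variance is then recovered by the usual identity.

```python
from fractions import Fraction
from math import comb

def moments_from_coefficients(f: list[int]) -> tuple[Fraction, Fraction]:
    """Mean and variance of the law with weights f[k], obtained from the
    derivatives of the polynomial sum_k f[k] u^k at u = 1."""
    total = sum(f)                                         # value at u = 1 (counting)
    d1 = sum(k * fk for k, fk in enumerate(f))             # first derivative at u = 1
    d2 = sum(k * (k - 1) * fk for k, fk in enumerate(f))   # second derivative at u = 1
    mean = Fraction(d1, total)
    # Var = E[X(X-1)] + E[X] - E[X]^2
    variance = Fraction(d2, total) + mean - mean ** 2
    return mean, variance

# Total number of a's in binary words of length n: f[k] = binom(n, k).
n = 12
mean, variance = moments_from_coefficients([comb(n, k) for k in range(n + 1)])
print(mean, variance)  # n/2 and n/4 for this parameter
```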
Our approach to limit laws will be as follows. 



Estimate the (unnormalized) probability generating function 

\[ f_n(u) = \sum_k f_{n,k}\, u^k = [z^n] F(z, u). \]

This is viewed as a single coefficient extraction (extracting the coefficient 
of $z^n$) but parameterized by $u$. Thanks to the availability of continuity theorems, 
the following can be proved for a great many cases of combinatorial 
interest: The existence and the shape of the limit law derive from an analysis 
of the bivariate generating function $F(z, u)$ taken in a fixed neighbourhood 
of $u = 1$. In addition, thanks to Berry–Esseen inequalities, the quality 
of an asymptotic estimate for $f_n(u)$ translates into a speed of convergence 
estimate for the corresponding laws. Also, for the discrete-to-continuous 
case, local limit laws derive from consideration of the bivariate generating 
function $F(z, u)$ taken on the whole of the unit circle, $|u| = 1$. Finally, 
large deviation estimates are seen to arise from estimates of $f_n(u)$ when $u$ 

³ A collection of recent works by Pemantle and coauthors [388, 389, 390] shows however that a well-defined 
class of bivariate asymptotic problems can be attacked by the theory of functions of several complex 
variables and a detailed study of the geometry of a singular variety. 
    Region                               Property 
    $u = 1$                              Counting 
    $u = 1 \pm \epsilon$                 Moments 
    $u \in V(1)$ (neighbourhood)         Central limit law 
    $|u| = 1$                            Local limit law 
    $u \in [\alpha, \beta]$              Large deviations 


FIGURE IX.2. The correspondence between regions of the $u$-plane and asymptotic 
properties of combinatorial distributions. 


is real and $u < 1$ (left tail) or $u > 1$ (right tail). This is to a large extent 
a reflection of saddle point bounds. In summary: Large deviations are related 
to the behaviour of $F(z, u)$ for real values of $u$ in an interval $[\alpha, \beta]$ 
containing $u = 1$. 
The correspondence between $u$-domains and properties of laws is summarized in Figure 2. 


Singularity perturbation. As seen throughout Chapters IV–VIII, analytic combinatorics 
approaches the univariate problem of counting objects of size $n$ starting from 
the Cauchy coefficient integral, 

\[ [z^n] F(z) = \frac{1}{2i\pi} \int_\gamma F(z)\, \frac{dz}{z^{n+1}}. \]

The singularities of $F(z)$ can be exploited, whether they are of a polar type (Chapters 
IV and V), algebraic-logarithmic (Chapters VI and VII) or essential and amenable 
to saddle point methods (Chapter VIII). It is in this way that asymptotic forms of 
$[z^n] F(z)$ are derived. 

From the discussion above, crucial information on combinatorial distributions 
is accessible from the bivariate generating function $F(z, u)$ when $u$ varies in some 
domain containing 1. This suggests considering $F(z, u)$ not so much as an analytic 
function of two complex variables, where $z$ and $u$ would play a symmetric rôle, but 
rather as a collection of functions of $z$ indexed by a secondary parameter $u$. In other 
words, $F(z, u)$ is considered as a deformation of $F(z) \equiv F(z, 1)$ when $u$ varies in a 
domain containing $u = 1$. Cauchy's coefficient integral gives 

\[ f_n(u) \equiv [z^n] F(z, u) = \frac{1}{2i\pi} \int_\gamma F(z, u)\, \frac{dz}{z^{n+1}}. \]

We can then examine the way the parameter $u$ affects the analysis of singularities 
performed in the asymptotic counting problem of estimating $[z^n] F(z, 1)$. Such an 
approach is called a singularity perturbation analysis. It consists in tracing the effect 
of a perturbation by $u$ on the standard singularity analysis associated with the univariate 
problem. 
The essential feature of the analysis of coefficients by means of complex techniques 
as seen in Chapters IV–VIII is to be “robust”. Being based on explicit estimates 
of contour integrals, it is usually amenable to smooth perturbations whose effect can 
be traced throughout calculations. Explicit estimates normally result (though added 
care in estimations is needed to ensure uniformity). In this chapter, we are going to 
see many applications of this strategy. 

Regarding binary words and the two parameters $\chi$ (initial run of $a$'s) and $\xi$ (total 
number of $a$'s), the general strategy of singularity perturbation instantiates as follows. 
In the case of $W_\chi$, there are two components in the BGF 

\[ W_\chi(z, u) = \frac{1}{1 - uz} \cdot \frac{1 - z}{1 - 2z}, \]

and, in essence, the dominant singular part, a simple pole at $z = 1/2$, arises from 
the second component, which does not change when $u$ varies. Accordingly, one has 

\[ W_\chi(z, u) \mathop{\sim}_{z \to 1/2} \frac{1/2}{(1 - u/2)(1 - 2z)}, \qquad [z^n] W_\chi(z, u) \sim \frac{1/2}{1 - u/2}\, 2^n. \]

The probability generating function of $\chi$ over $\mathcal{W}_n$ is then obtained upon dividing by 
$2^n$, and 

\[ \frac{1}{2^n} [z^n] W_\chi(z, u) \sim \frac{1/2}{1 - u/2} = \sum_{k \ge 0} \frac{1}{2^{k+1}}\, u^k, \]

where the last expression is none other than the probability generating function of 
a discrete law, namely, the geometric distribution of parameter $1/2$. As we shall see 
in Section IX.2, where we state a continuity theorem for probability generating 
functions, this is enough to conclude that the distribution of $\chi$ converges to a geometric 
law. 
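
The convergence of probability generating functions in this first case can be observed directly. The sketch below (illustrative names) extracts $[z^n]$ of the product $\frac{1}{1 - uz} \cdot \frac{1 - z}{1 - 2z}$ as a Cauchy product of the two factors, normalizes by $2^n$, and compares the result with the geometric PGF $\frac{1/2}{1 - u/2}$ at a numerical value of $u$ with $|u| < 2$.

```python
def pgf_initial_run(n: int, u: complex) -> complex:
    """(1/2^n) [z^n] of 1/(1 - u z) * (1 - z)/(1 - 2 z), evaluated at a numerical u."""
    # Coefficients of (1 - z)/(1 - 2z) = 1 + z + 2 z^2 + 4 z^3 + ...
    second = [1] + [2 ** (m - 1) for m in range(1, n + 1)]
    # Cauchy product with 1/(1 - u z) = sum_j u^j z^j.
    coeff = sum(u ** j * second[n - j] for j in range(n + 1))
    return coeff / 2 ** n

def geometric_pgf(u: complex) -> complex:
    """PGF of the geometric law with P(k) = 1/2^(k+1)."""
    return 0.5 / (1 - u / 2)

for n in (5, 20, 60):
    print(n, abs(pgf_initial_run(n, 1.5) - geometric_pgf(1.5)))
```

The discrepancy decays like $|u/2|^n$, which reflects the fact that the only $u$-dependent singularity, at $z = 1/u$, lies farther from the origin than the fixed pole at $z = 1/2$.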

In the second case, that of $W_\xi$, the auxiliary parameter modifies the location of 
the singularity, 

\[ W_\xi(z, u) = \frac{1}{1 - z(1 + u)}. \]

Then, the singular behaviour is strongly dependent upon a singularity at 

\[ \rho(u) = \frac{1}{1 + u}, \]

that moves as $u$ varies, while the type of singularity (here a simple pole) remains the 
same. Accordingly, the coefficients obey a “large power law” (here of an exact type) 
and, as regards the probability generating function of $\xi$ over $\mathcal{W}_n$, one has 

\[ \frac{1}{2^n} [z^n] W_\xi(z, u) = \left( \frac{1}{2 \rho(u)} \right)^n = \left( \frac{1 + u}{2} \right)^n. \]

This analytical form is reminiscent of the central limit theorem of probability theory, 
according to which large powers, corresponding to sums of a large number of independent 
random variables, entail convergence to a Gaussian law. By continuity theorems for 
integral transforms exposed in Section IX.4, there results a continuous limit law of 
the Gaussian type in this case. 
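
The large-power form makes the Gaussian limit easy to observe at the level of transforms. In the sketch below (illustrative names), the PGF $\left(\frac{1+u}{2}\right)^n$ is evaluated at $u = e^{it/\sigma_n}$ and recentred, which gives the characteristic function of the standardized variable; it is then compared with $e^{-t^2/2}$, the characteristic function of the standard Gaussian law.

```python
import cmath
import math

def standardized_cf(n: int, t: float) -> complex:
    """E[exp(i t (xi - mu_n) / sigma_n)] for xi = number of a's in a random word of length n."""
    mu, sigma = n / 2, math.sqrt(n) / 2
    u = cmath.exp(1j * t / sigma)
    pgf = ((1 + u) / 2) ** n                        # exact large-power form of the PGF
    return cmath.exp(-1j * t * mu / sigma) * pgf    # recentring at the mean

for n in (10, 100, 10000):
    for t in (0.5, 1.0, 2.0):
        print(n, t, abs(standardized_cf(n, t) - math.exp(-t * t / 2)))
```

For this particular parameter the recentred transform reduces to $\cos^n(t/\sqrt{n})$, so the convergence to $e^{-t^2/2}$ is the elementary limit of a large power.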
F(z, u) when u ≈ 1              Type of law                    Method and classes 
Sing. + expo. fixed             Discrete limit                 Subcritical composition § XXX 
                                (Neg. binom., Poisson, ...)    Subcritical Seq., Set, ... § XXX 
Sing. moves, expo. fixed        Gaussian(n, n)                 Supercritical composition § XXX 
    "                               "                          Meromorphic perturb. § XXX 
    "                               "                          (Rational fns) § XXX 
    "                               "                          Sing. analysis perturb. § XXX 
    "                               "                          (Alg., implicit fns) § XXX 
Sing. fixed, expo. moves        Gaussian(log n, log n)         Exp-log struct. § XXX 
    "                               "                          (Differential eq.) § XXX 
Sing. + expo. move              Gaussian                       [Gao-Richmond [226]] 
Essential singularity           often Gaussian                 Saddle point perturbation § XXX 
Discontinuous singular type     non-Gaussian                   (Various cases) § XXX 
    "                           Stable                         Critical composition § XXX 


FIGURE IX.3. A rough typology of bivariate generating functions $F(z, u)$ and 
limit laws. 


The foregoing discussion suggests that a “minor” perturbation of a bivariate generating 
function that affects neither the location nor the nature of the singularity could 
lead to a discrete limit law. A “major” change in exponent or even, as here, in location 
is likely to be conducive to a continuous limit law, of which the prime example 
is the normal distribution. Figure 3 outlines a typology of limit laws in the context 
of bivariate asymptotics. A bivariate generating function F(z, u) is to be analysed. 
The deformation induced by u may affect the type of singularity that F(z, u) has in 
various ways. An adapted complex coefficient extraction then provides various types 
of limit laws. 


IX. 2. Discrete limit laws 


Take a class $\mathcal{C}$ on which a parameter $\chi$ is defined. This determines for each $n$ 
a random variable $X_n$, which is $\chi$ restricted to $\mathcal{C}_n$, where $\mathcal{C}_n$ is endowed with the 
uniform probability distribution. In this section, we give the general definitions and 
results that are suitable for the discrete-to-discrete situation, where a discrete parame- 
ter tends without normalization to a discrete distribution. The corresponding notion of 
convergence is given in Subsection IX. 2.1. Probability generating functions (PGFs) 
are important since, by virtue of a continuity theorem stated in Subsection IX. 2.2, 
convergence in law results from convergence of PGFs. At the same time, the fact that 
PGFs of two distributions are close entails that the original distribution functions are 
close. Finally, large deviation estimates for a distribution can be easily related to an- 
alytic continuation of its PGFs, a fact introduced in Subsection IX. 2.3. This section 
organizes some general tools and accordingly we limit ourselves to a single combina- 
torial application, that of the number of cycles of some small fixed size in a random 
permutation. The next section will provide a number of deeper applications to random 
combinatorial structures. 
IX. 2.1. Convergence to a discrete law. In order to specify precisely what a 
limit law is, we base ourselves on the general context described in APPENDIX C: 
Convergence in law, p. 722. The principles exposed there provide for what should be 
the “right” notion of convergence of a family of discrete distributions to a limit discrete 
distribution. Here is a self-standing definition. 

Definition IX.1 (Discrete-to-discrete convergence). The random variable $X_n$ (supported 
by $\mathbb{Z}_{\ge 0}$) is said to converge in law to a discrete variable $Y$ supported by $\mathbb{Z}_{\ge 0}$ 
if for each $k \ge 0$, one has 

\[ \lim_{n \to \infty} \mathbb{P}(X_n \le k) = \mathbb{P}(Y \le k), \quad \text{i.e.,} \quad \lim_{n \to \infty} \sum_{j \le k} p_{n,j} = \sum_{j \le k} q_j, \tag{3} \]

where $p_{n,k} := \mathbb{P}(X_n = k)$ and $q_k := \mathbb{P}(Y = k)$. One also says that the parameter $\chi$ 
on $\mathcal{C}$ admits a limit law of type $Y$. 
Convergence is said to take place at speed $\epsilon_n$ if 

\[ \sup_k \Bigl| \sum_{j \le k} \mathbb{P}(X_n = j) - \sum_{j \le k} q_j \Bigr| \le \epsilon_n. \tag{4} \]



The condition in (3) can be rewritten in terms of the distribution functions $F_n, G$ 
of $X_n, Y$ as 

\[ \lim_{n \to \infty} F_n(k) = G(k), \]

pointwise for each $k$. When such a property of type (3) relative to distribution functions 
holds, it is also called a “central” limit law. (One good reason for this terminology 
is that convergence of distribution functions is principally informative in the 
“central part” of the distribution, where a fair proportion of the probability mass lies.) 
By differencing, the condition of (3) is clearly equivalent to the condition that, for 
each $k$, 

\[ \lim_{n \to \infty} p_{n,k} = q_k, \tag{5} \]

and $\delta_n$ is called a local speed of convergence if 

\[ \sup_k |p_{n,k} - q_k| \le \delta_n. \]

The property (5) is said to constitute a local limit law, as probabilities $p_{n,k}$ are estimated 
“locally”. Thus: For the convergence of a discrete law to a discrete law, there 
is complete equivalence between the existence of central and local limits. Note 1 
below shows elementarily that there always exists a speed of convergence that tends 
to 0 as n tends to infinity. In other words, plain convergence of distribution functions 
or of individual probabilities implies uniform convergence (this is in fact a general 
phenomenon). 

▷ IX.1. Uniform convergence. Local and central convergences to a discrete limit law are always 
uniform. In other words, there always exist speeds $\epsilon_n, \delta_n$ tending to 0 as $n \to \infty$. 

Assume simply the condition (3) and its equivalent form (5). Fix a small $\epsilon > 0$. First 
dispose of the tails: there exists a $k_0$ such that $\sum_{k > k_0} q_k < \epsilon$, so that $\sum_{k \le k_0} q_k > 1 - \epsilon$. Now, 
by simple convergence, there exists an $n_0$ such that, for all $n$ larger than $n_0$ and each $k \le k_0$, 
$|p_{n,k} - q_k| < \epsilon / k_0$. Thus, we have $\sum_{k \le k_0} p_{n,k} > 1 - 2\epsilon$, hence $\sum_{k > k_0} p_{n,k} < 2\epsilon$. In 
other words, $\sum_{k > k_0} q_k$ and $\sum_{k > k_0} p_{n,k}$ are both in $[0, 2\epsilon]$. There results that convergence of 
distribution functions is uniform, with speed $5\epsilon$ at most. At the same time, the local speed $\delta_n$ is 
at most $4\epsilon$. ◁


▷ IX.2. Speed in local and central estimates. Let $M_n$ be the spread of $\chi$ on $\mathcal{C}_n$ defined as 
$M_n := \max_{\gamma \in \mathcal{C}_n} \chi(\gamma)$. Then, a speed of convergence in (4) is given by 

\[ \epsilon_n = M_n \delta_n + \sum_{k > M_n} q_k. \]

(Refinements of these inequalities can be obtained from tail estimates detailed below.) ◁
▷ IX.3. Total variation distance. The total variation distance between $X$ and $Y$ is classically 

\[ d_{TV}(X, Y) := \sup_{E \subseteq \mathbb{Z}_{\ge 0}} |\mathbb{P}_Y(E) - \mathbb{P}_X(E)| = \frac12 \sum_{k \ge 0} |\mathbb{P}(Y = k) - \mathbb{P}(X = k)|. \]

(Equivalence between the two forms is established elementarily by considering the particular 
$E$ for which the supremum is attained.) The argument of Note 1 shows that convergence in 
distribution also implies that the total variation distance between $X_n$ and $X$ tends to 0. In 
addition, by Note 2, one has $d_{TV}(X_n, X) \le M_n \delta_n + \sum_{k > M_n} q_k$. ◁


▷ IX.4. Escape to infinity. The sequence $X_n$, where 

\[ \mathbb{P}\{X_n = 0\} = \tfrac13, \qquad \mathbb{P}\{X_n = 1\} = \tfrac13, \qquad \mathbb{P}\{X_n = n\} = \tfrac13, \]

does not satisfy a discrete limit law in the sense above, although $\lim_n \mathbb{P}\{X_n = k\}$ exists for 
each $k$. Some of the probability mass escapes to infinity and, in a way, convergence takes place 
in $\mathbb{Z} \cup \{+\infty\}$. ◁
A highly plausible indication of the occurrence of a discrete law is the fact that
$\mu_n = O(1)$, $\sigma_n = O(1)$. Examination of initial entries in the table of values of the
probabilities will then normally permit one to decide whether a limit law holds.


EXAMPLE IX.1. Singleton cycles in permutations. The case of the number of singleton cycles
(cycles of length 1) in a random permutation of size $n$ illustrates the basic definitions and it can
be analysed with minimal analytic apparatus. The exponential BGF is
\[ P(z,u) = \frac{\exp(z(u-1))}{1-z}, \]
which determines the mean $\mu_n = 1$ and the standard deviation $\sigma_n = 1$ (for $n \ge 2$). The table
of numerical values of the probabilities $p_{n,k} = [z^n u^k]P(z,u)$ immediately tells what goes on.


            k = 0   k = 1   k = 2   k = 3   k = 4   k = 5
  n = 4     0.375   0.333   0.250   0.000   0.041
  n = 5     0.366   0.375   0.166   0.083   0.000   0.008
  n = 10    0.367   0.367   0.183   0.061   0.015   0.003
  n = 20    0.367   0.367   0.183   0.061   0.015   0.003


The exact distribution is easily extracted from the bivariate GF,
\[ p_{n,k} = [z^n u^k] P(z,u) = \frac{1}{k!}\,[z^{n-k}]\frac{e^{-z}}{1-z} = \frac{d_{n-k}}{k!}, \]
where $n!\,d_n$ is the number of derangements of size $n$, that is,
\[ d_n = [z^n]\frac{e^{-z}}{1-z} = \sum_{j=0}^{n} \frac{(-1)^j}{j!}. \]
Asymptotically, one has $d_n \sim e^{-1}$. Thus, for fixed $k$, we have
\[ \lim_{n \to \infty} p_{n,k} = p_k, \qquad p_k = \frac{e^{-1}}{k!}. \]
As a consequence, the distribution of singleton cycles in a random permutation of large size
tends to a Poisson law of rate $\lambda = 1$.
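
The convergence to the Poisson law can be checked numerically. The following Python sketch is only an illustration: the function names are ours, not the text's, and it relies solely on the formula $p_{n,k} = d_{n-k}/k!$ with $d_n = \sum_{j \le n} (-1)^j/j!$. It computes the exact law with rational arithmetic and the largest gap to the Poisson(1) probabilities.

```python
from fractions import Fraction
from math import exp, factorial


def derangement_ratio(n):
    """d_n = sum_{j=0}^{n} (-1)^j / j!, so that n! * d_n counts derangements."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(n + 1))


def singleton_cycle_law(n):
    """Exact probabilities p_{n,k} = d_{n-k} / k! for k = 0..n."""
    return [derangement_ratio(n - k) / factorial(k) for k in range(n + 1)]


def poisson1(k):
    """Poisson(1) probability of the value k."""
    return exp(-1) / factorial(k)


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        law = singleton_cycle_law(n)
        # Largest local deviation from the Poisson(1) limit for this n.
        worst = max(abs(float(p) - poisson1(k)) for k, p in enumerate(law))
        print(n, [round(float(p), 3) for p in law[:6]], f"{worst:.2e}")
```

Exact fractions are used so that the probabilities sum to 1 without rounding error; only the comparison with $e^{-1}/k!$ is done in floating point.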


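Convergence is quite fast, as is seen from the differences $\delta_{n,k} = p_{n,k} - e^{-1}/k!$.

The speed of convergence is easily bounded. One has $d_n = e^{-1} + O(1/n!)$, by the alternating
series property, so that
\[ p_{n,k} = \frac{e^{-1}}{k!} + O\left(\frac{1}{k!\,(n-k)!}\right)
           = \frac{e^{-1}}{k!} + O\left(\frac{1}{n!}\binom{n}{k}\right)
           = \frac{e^{-1}}{k!} + O\left(\frac{2^n}{n!}\right). \]
As a consequence, one obtains local ($\delta_n$) and central ($\epsilon_n$) speed estimates
\[ \delta_n = O\left(\frac{2^n}{n!}\right), \qquad \epsilon_n = O\left(\frac{n\,2^n}{n!}\right). \]
These bounds are quite tight. For instance one computes that $\delta_{50} \doteq 1.5 \cdot 10^{-52}$ while the
quantity $2^n/n!$ evaluates to $3.7 \cdot 10^{-50}$ at $n = 50$.
END OF EXAMPLE IX.1.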


IX.2.2. Continuity theorem for PGFs. A higher level approach to discrete
limit laws in analytic combinatorics is based on asymptotic estimates of $p_n(u)$, the
PGF of the random variable $X_n$. If, for sufficiently many values of $u$, one has
\[ p_n(u) \to q(u) \qquad (n \to +\infty), \]
one can infer that the coefficients $p_{n,k} = [u^k]p_n(u)$ (for any fixed $k$) tend to the limit
$q_k$, whose generating function is $q(u)$. A continuity theorem for characteristic functions
describes precisely sets of conditions under which convergence of probability generating
functions to a limit entails convergence of coefficients to a limit, that is to say
the occurrence of a discrete limit law. We state here a continuity theorem with very
general analytic conditions.


Theorem IX.1 (Continuity Theorem, discrete laws). Let $\Omega$ be an arbitrary set contained
in the unit disc and having at least one accumulation point in the interior of the
disc. Assume that the PGFs $p_n(u) = \sum_{k \ge 0} p_{n,k} u^k$ and $q(u) = \sum_{k \ge 0} q_k u^k$ are such
that there is convergence,
\[ \lim_{n \to +\infty} p_n(u) = q(u), \]
pointwise for each $u$ in $\Omega$. Then a discrete limit law holds in the sense that, for each $k$,
\[ \lim_{n \to +\infty} p_{n,k} = q_k \qquad\text{and}\qquad \lim_{n \to +\infty} \sum_{j \le k} p_{n,j} = \sum_{j \le k} q_j. \]


PROOF. The $p_n(u)$ are a priori analytic in $|u| < 1$ and uniformly bounded by 1 in
modulus throughout $|u| \le 1$. Vitali's Theorem is a classical result of analysis whose
statement (see [469, p. 168] or [265, p. 566]) is as follows:

Vitali's theorem. Let $\mathcal{F}$ be a family of analytic functions defined in a region
$S$ (i.e., an open connected set) and uniformly bounded on every compact
subset of $S$. Let $\{f_n\}$ be a sequence of functions of $\mathcal{F}$ that converges
on a set $\Omega \subset S$ having a point of accumulation $q \in S$. Then $\{f_n\}$ converges
in all of $S$, uniformly on every compact subset $T \subset S$.

Here, $S$ is the open unit disc on which all the $p_n(u)$ are bounded. The sequence in
question is $\{p_n(u)\}$. By assumption, there is convergence of $p_n(u)$ to $q(u)$ on $\Omega$.
Vitali's theorem implies that this convergence is uniform in any compact subdisc of
the unit disc, for instance, $|u| \le 1/2$. Then, Cauchy's coefficient formula provides
\[ q_k = \frac{1}{2i\pi} \int_{|u| = 1/2} q(u)\, \frac{du}{u^{k+1}}
       = \lim_{n \to \infty} \frac{1}{2i\pi} \int_{|u| = 1/2} p_n(u)\, \frac{du}{u^{k+1}}
       = \lim_{n \to \infty} p_{n,k}. \tag{6} \]
Uniformity granted by Vitali's theorem combined with continuity of the contour integral
(with respect to the integrand) establishes the statement.

FIGURE IX.4. The PGFs of singleton cycles in random permutations of size n =
4, 8, 12 (left to right and top to bottom) illustrate convergence to the limit PGF of
the Poisson(1) distribution (bottom right). Here the modulus of each PGF for
$|\Re(u)|, |\Im(u)| \le 3$ is displayed.

Feller gives the sufficient set of conditions: $p_n(u) \to q(u)$ pointwise for all real
$u \in\, ]0, 1[$; see [161, p. 280] for a proof that only involves elementary real analysis. It
is perhaps surprising that very different sets can be taken, for instance,
\[ \Omega = \left[-\tfrac13, -\tfrac12\right], \qquad \Omega = \left\{\tfrac1n\right\}, \qquad
   \Omega = \left\{\tfrac{\sqrt{-1}}{2} + \tfrac1n\right\}. \]

The next statement relates a measure of distance between two PGFs, $p(u)$ and
$q(u)$, to the distance between distributions. It is naturally of interest when quantifying
speed of convergence to the limit in the discrete-to-discrete case.


Theorem IX.2 (Speed of convergence, discrete laws). Consider two discrete laws
supported by $\mathbb{Z}_{\ge 0}$, with corresponding distribution functions $F(x)$, $G(x)$ and
probability generating functions $p(u)$, $q(u)$.

(i) Assume that the laws have first moments. Then, for any $T \in (0, \pi)$, one has,
for some absolute constant $c$,
\[ \sup_k |F(k) - G(k)| \le c \int_{-T}^{+T} \frac{|p(e^{it}) - q(e^{it})|}{|t|}\, dt
   + \frac{c}{T} \sup_{T \le |t| \le \pi} |p(e^{it}) - q(e^{it})|. \]

(ii) Assume that $p(u)$ and $q(u)$ are analytic in $|u| < \rho$ for some $\rho > 1$. Then, for
any $r$ satisfying $1 < r < \rho$, one has
\[ \sup_k |F(k) - G(k)| \le c(r) \sup_{|u| = r} |p(u) - q(u)|, \qquad c(r) = \frac{1}{r(r-1)}. \]
PROOF. (i) Observe first that $p(1) = q(1) = 1$, so that the integrand is of the form
$\frac{0}{0}$ at $u = 1$. By APPENDIX C: Transforms of distributions, p. 718, the existence
of first moments, say $\mu$ and $\nu$, implies that, for small $t$, one has
$p(e^{it}) - q(e^{it}) = i(\mu - \nu)t + o(t)$, so that the integral is well defined.
For any given $k$, Cauchy's coefficient formula provides
\[ F(k) - G(k) = \frac{1}{2i\pi} \int_{\gamma} \frac{p(u) - q(u)}{1 - u}\, \frac{du}{u^{k+1}}, \tag{8} \]
where $\gamma$ is the circle $|u| = 1$. (The factor $(1 - u)^{-1}$ sums coefficients.) Set $u = e^{it}$
and split the interval of integration accordingly. For all $t \in [-\pi, \pi]$, one has
\[ \left| \frac{t}{e^{it} - 1} \right| \le \frac{\pi}{2}. \]
This makes it possible to replace $(1 - u)^{-1}$ by $1/t$, up to a constant multiplier. The
statement follows upon splitting the interval of integration according to $|t| \le T$ and
$|t| \ge T$, and then applying trivial bounds.

(ii) Start again from (8), but integrate along $|u| = r$. Trivial bounds provide the
statement.

The first form is universal: it holds with
strictly minimal assumptions (existence of expectations); the second form is a priori
only usable for distributions that have exponential tails. In the context of limit laws,
the first form of the theorem serves to relate the distance on the unit circle between
the PGF $p_n(u)$ of a combinatorial parameter and the limit PGF $q(u)$ to the speed of
convergence to the limit law. (In this sense, it prefigures the Berry-Esseen inequalities
discussed in the continuous context below.)


EXAMPLE IX.2. Cycles of length m in permutations. Let us first revisit the case of singleton
cycles, $m = 1$, in this new light. The BGF $P(z,u) = e^{z(u-1)}/(1 - z)$ has for each $u$ a simple
pole at $z = 1$ and is otherwise analytic in $\mathbb{C} \setminus \{1\}$. Thus, a meromorphic analysis provides
instantly, pointwise for any fixed $u$,
\[ [z^n]P(z,u) = e^{(u-1)} + O(R^{-n}), \]
with any $R > 1$. This, by the continuity theorem, Theorem IX.1, implies convergence to a
Poisson law.

Next, one should estimate a distance between characteristic functions over the unit circle.
One has (for $u = e^{it}$)
\[ p_n(u) - q(u) = [z^n]\, \frac{e^{z(u-1)} - e^{(u-1)}}{1 - z}. \]
There is a removable singularity at $z = 1$. Thus, integration over the circle $|z| = 2$ in the
$z$-plane coupled with trivial bounds yields
\[ |p_n(u) - q(u)| \le 2^{-n} \sup_{|z| = 2} \left| e^{z(u-1)} - e^{(u-1)} \right| = O\left(2^{-n} |1 - u|\right). \]
One can then apply Theorem IX.2 with an arbitrary choice of $T$ to the effect that a speed of
convergence to the limit is $O(2^{-n})$. (Any $O(R^{-n})$ is possible by the same argument.)
This approach generalizes to the number of $m$-cycles in a random permutation. The
exponential BGF is
\[ F(z,u) = \frac{e^{(u-1)z^m/m}}{1 - z}. \]
Then, singularity analysis of the meromorphic function of $z$ (for $u$ fixed) gives immediately
\[ \lim_{n \to \infty} [z^n]F(z,u) = e^{(u-1)/m}. \]
The right side of this equality is none other than the PGF of a Poisson law of rate $\lambda = \frac1m$. The
continuity theorem and the first form of the speed of convergence theorem then imply:

Proposition IX.1 (m-Cycles in permutations). The number of $m$-cycles in a random permutation
of large size converges in law to a Poisson distribution of rate $1/m$ with speed of
convergence $O(R^{-n})$ for any $R > 1$.

This vastly generalizes our previous observations on singleton cycles.
END OF EXAMPLE IX.2.
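
As a complement, the limit in Example IX.2 can also be seen by elementary means. The following worked equation is our own remark, not part of the text's argument; it only uses the expansion of the exponential and the fact that the factor $(1-z)^{-1}$ sums coefficients. The PGF for size $n$ is exactly a Taylor polynomial of the Poisson PGF, and the error is the corresponding Taylor remainder.

```latex
\[
  p_n(u) \;=\; [z^n]\,\frac{e^{(u-1)z^m/m}}{1-z}
         \;=\; \sum_{j=0}^{\lfloor n/m \rfloor} \frac{1}{j!}\left(\frac{u-1}{m}\right)^{j},
\]
\[
  \left| p_n(u) - e^{(u-1)/m} \right|
    \;\le\; \sum_{j > \lfloor n/m \rfloor} \frac{1}{j!}\left(\frac{|u-1|}{m}\right)^{j}.
\]
```

For $u$ in any bounded set the right-hand side decays faster than any $R^{-n}$, in agreement with the speed stated in Proposition IX.1.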
▷ IX.5. Poisson law for rare events. Consider the Bernoulli distribution with PGF $(p + qu)^n$.
If $q$ depends on $n$ in such a way that $q = \lambda/n$ for some fixed $\lambda$, then the limit law of the
Bernoulli random variable is Poisson of rate $\lambda$. (This “law of small numbers” explains the
Poisson character of activity in radioactive decay as well as the probability of accidental deaths
of soldiers in the Prussian army resulting from the kick of a horse [Bortkiewicz, 1898].) ◁


IX.2.3. Large deviations. In the case of discrete limit laws, the study of large
deviations is related to saddle-point bounds and is consequently often quite easy. We
give a general statement which is nothing but a rephrasing of saddle-point bounds
(Chapter IV) in the context of discrete probability distributions.


Theorem IX.3 (Large deviations, discrete laws). Let $p(u) = E(u^X)$ be a probability
generating function that is analytic for $|u| \le r$ where $r$ is some number satisfying
$r > 1$. Then, the following “local” and “central” large deviation bounds hold:
\[ P(X = k) \le \frac{p(r)}{r^k}, \qquad P(X > k) \le \frac{p(r)}{r^k (r - 1)}. \]
PROOF. The local bound is a direct consequence of saddle point bounds given in
Chapter IV. The central bound derives from the equality
\[ P(X > k) = \frac{1}{2i\pi} \int_{|u| = r} p(u) \left( 1 + \frac1u + \frac{1}{u^2} + \cdots \right) \frac{du}{u^{k+2}}
            = \frac{1}{2i\pi} \int_{|u| = r} p(u)\, \frac{du}{u^{k+1}(u - 1)}, \]
upon applying trivial bounds.

In accordance with this theorem and as is easily checked directly, the geometric
and the negative binomial laws have exponential tails; the Poisson law has a
“super-exponential” tail, being $O(r^{-k})$ for any $r > 1$, as the PGF is entire. (See definitions
in APPENDIX C: Special distributions, p. 720.) By their nature, the bounds can be
simultaneously applied to a whole family of probability generating functions. Hence
their use in obtaining uniform estimates in the context of limit laws. The bound
provided always exhibits a geometric decay in the value of $k$; this is both a strength and
a limitation on the method.
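
The two bounds of Theorem IX.3 are easy to test on a law whose PGF is known in closed form. The sketch below is an illustration only, with function names of our own choosing; it takes the Poisson law of rate $\lambda$, whose PGF is $e^{\lambda(u-1)}$, and compares exact probabilities and a truncated tail with the bounds $p(r)/r^k$ and $p(r)/(r^k(r-1))$.

```python
from math import exp, factorial


def poisson_pmf(lam, k):
    """P(X = k) for a Poisson law of rate lam."""
    return exp(-lam) * lam ** k / factorial(k)


def poisson_pgf(lam, u):
    """PGF E(u^X) = exp(lam * (u - 1)); it is entire, so any r > 1 is allowed."""
    return exp(lam * (u - 1))


def local_bound(pgf_value, r, k):
    """Bound p(r) / r^k on P(X = k)."""
    return pgf_value / r ** k


def central_bound(pgf_value, r, k):
    """Bound p(r) / (r^k (r - 1)) on P(X > k)."""
    return pgf_value / (r ** k * (r - 1))


def poisson_tail(lam, k, terms=80):
    """Truncated sum for P(X > k); being truncated, it never exceeds the true tail."""
    return sum(poisson_pmf(lam, j) for j in range(k + 1, k + 1 + terms))


if __name__ == "__main__":
    lam, r = 1.0, 3.0
    p_r = poisson_pgf(lam, r)
    for k in (0, 5, 10, 20):
        print(k, poisson_pmf(lam, k), local_bound(p_r, r, k),
              poisson_tail(lam, k), central_bound(p_r, r, k))
```

The choice $r = 3$ is arbitrary. Since the bound decays only geometrically in $k$ for a fixed $r$, it is far from the true factorial decay of the Poisson tail, which illustrates the limitation mentioned above.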


IX.3. Combinatorial instances of discrete laws 


In this section, we focus our attention on a general analytic schema based on com- 
positions. The subcritical case of this schema is such that the perturbations induced 
by the secondary variable (u) affect neither the location nor the nature of the basic 
singularity involved in the univariate counting problem. The limit laws are then of the 
discrete type: for sequences, labelled sets, and labelled cycles, these limit laws are
invariably of the negative binomial (NB[2]), Poisson, and geometric type, respectively.
Additionally, it is easy to describe the profiles of combinatorial objects resulting from 
such subcritical constructions. 


First, we consider the general composition schema, 


F(z,u) = g(uh(z)). 


This schema expresses over generating functions the combinatorial operation $\mathcal{G}[\mathcal{H}]$ of
substitution of components $\mathcal{H}$ enumerated by $h(z)$ inside “templates” $\mathcal{G}$ enumerated
by $g(z)$. (See Chapters I and II for the unlabelled and labelled versions.) The variable
$z$ marks size as usual, and the variable $u$ marks the size of the $\mathcal{G}$ template.

We assume globally that $g$ and $h$ have nonnegative coefficients and that $h(0) = 0$
so that the composition $g(h(z))$ is well-defined. We let $\rho_g$ and $\rho_h$ denote the radii of
convergence of $g$ and $h$, and define
\[ \tau_g = \lim_{x \to \rho_g^-} g(x) \qquad\text{and}\qquad \tau_h = \lim_{x \to \rho_h^-} h(x). \tag{9} \]
The (possibly infinite) limits exist due to nonnegativity of coefficients. As already
seen in Chapter VI, three cases are to be distinguished.


Definition IX.2. The composition schema $g(uh(z))$ is said to be: subcritical if $\tau_h < \rho_g$,
critical if $\tau_h = \rho_g$, supercritical if $\tau_h > \rho_g$.

In terms of singularities, the behaviour of $g(h(z))$ at its dominant singularity
is dictated by the dominant singularity of $h$ (subcritical case), or by the dominant
singularity of $g$ (supercritical case), or it should involve a mixture of the two (critical
case). This section discusses the subcritical case. First, a general statement about
subcritical compositions:


Proposition IX.2 (Subcritical composition). Consider the bivariate composition
scheme $F(z,u) = g(uh(z))$. Assume that $g(z)$ and $h(z)$ satisfy the subcriticality
condition $\tau_h < \rho_g$, and that $h(z)$ has a unique singularity at $\rho = \rho_h$ on its disc of
convergence, which is of the algebraic-logarithmic type
\[ h(z) = \tau - c\,(1 - z/\rho)^{\lambda} + o\left((1 - z/\rho)^{\lambda}\right), \]
where $\tau = \tau_h$, $c \in \mathbb{R}^+$, $0 < \lambda < 1$. Then, a discrete limit law holds,
\[ \lim_{n \to \infty} \frac{f_{n,k}}{f_n} = q_k, \qquad q_k = \frac{k\, g_k\, \tau^{k-1}}{g'(\tau)}, \]
with probability generating function $q(u) = \dfrac{u\,g'(\tau u)}{g'(\tau)}$.

What stands out is that, via its PGF, the limit law is a direct reflection of the derivative
of the outer function involved in the composition.

PROOF. First, for the univariate problem, since $g(z)$ is analytic at $\tau$, the function
$g(h(z))$ is singular at $\rho_h$ and is analytic in a $\Delta$-domain. Its singular expansion is
obtained by composing the regular expansion of $g(z)$ at $\tau$ with the singular expansion
of $h(z)$ at $\rho_h$:
\[ F(z) \equiv g(h(z)) = g(\tau) - c\,g'(\tau)(1 - z/\rho)^{\lambda}(1 + o(1)). \]
Thus, $F(z)$ satisfies the conditions of singularity analysis, and
\[ f_n \equiv [z^n]F(z) = -\frac{c\,g'(\tau)}{\Gamma(-\lambda)}\, \rho^{-n} n^{-\lambda - 1}(1 + o(1)). \tag{10} \]
Also, the mean and variance of the distribution are clearly $O(1)$.
Next, for the bivariate problem, fix any $u$ with, say, $u \in (0,1)$. The BGF $F(z,u)$
is still singular at $z = \rho$, and its singular expansion obtained from $F(z,u) = g(uh(z))$
by composition, is
\[ F(z,u) = g(uh(z)) = g\left(u\tau - cu(1 - z/\rho)^{\lambda} + o((1 - z/\rho)^{\lambda})\right)
          = g(u\tau) - cu\,g'(u\tau)(1 - z/\rho)^{\lambda} + o\left((1 - z/\rho)^{\lambda}\right). \]
Thus, singularity analysis implies immediately:
\[ \lim_{n \to \infty} \frac{[z^n]F(z,u)}{[z^n]F(z,1)} = \frac{u\,g'(u\tau)}{g'(\tau)}. \]
By the continuity theorem for PGFs, this is enough to imply convergence to the
discrete limit law with PGF $u\,g'(\tau u)/g'(\tau)$, and the proposition is established.
Under the subcritical composition scheme, it is also true that the tails have a
uniformly geometric decay. Let $u_0$ be any number of the interval $(1, \rho_g/\tau_h)$. Then
$F(z, u_0)$ as a function of $z$ is analytic near the origin with a dominant singularity at $\rho_h$
obtained by composing the regular expansion of $g$ with the singular expansion of $h$:
\[ F(z, u_0) = g(u_0 \tau_h) - c\,u_0\, g'(u_0 \tau_h)(1 - z/\rho)^{\lambda} + o\left((1 - z/\rho)^{\lambda}\right). \]
There results the asymptotic estimate
\[ p_n(u_0) = \frac{[z^n]F(z, u_0)}{[z^n]F(z, 1)} \sim \frac{u_0\, g'(u_0 \tau_h)}{g'(\tau_h)}. \]
Thus, for some constant $K = K(u_0)$, one has
\[ p_n(u_0) < K. \]
It is easy also to verify that $p_n(u)$ is analytic at $u_0$, so that, by Theorem IX.3,
\[ p_{n,k} \le K(u_0)\, u_0^{-k}, \qquad \sum_{j > k} p_{n,j} \le \frac{K(u_0)}{u_0 - 1}\, u_0^{-k}. \]
Thus the combinatorial distributions satisfy uniformly (with respect to $n$) a large
deviations bound. In particular the probability that there are more than a logarithmic
number of components satisfies
\[ P_n(\chi > \log n) = O(n^{-\theta}), \qquad \theta = \log u_0. \tag{11} \]
Such tail estimates may additionally serve to evaluate the speed of convergence to the
limit law (as well as the total variation distance) in the subcritical composition schema.


▷ IX.6. Semi-small powers and singularity analysis. Let $h(z)$ satisfy the stronger singular
expansion
\[ h(z) = \tau - c\,(1 - z/\rho)^{\lambda} + O\left((1 - z/\rho)^{\nu}\right), \]
for $0 < \lambda < \nu < 1$. Then, for $k \le C \log n$ (some $C > 0$), the results of singularity analysis
can be extended (as stated and proved in Chapter VI, they are only valid for fixed $k$):
\[ [z^n]h(z)^k = -\frac{k\,c\,\tau^{k-1}}{\Gamma(-\lambda)}\, \rho^{-n} n^{-\lambda - 1}\left(1 + O(n^{-\theta_1})\right) \]
for some $\theta_1 > 0$, uniformly with respect to $k$. [The proof recycles all the ideas of Chapter VI
and only needs some care in checking uniformity with respect to $k$ of the major steps.] ◁
▷ IX.7. Speed of convergence in subcritical compositions. Combining the exponential tail
estimate (11) and local estimates deriving from the singularity analysis of “semi-small” powers
in the previous note, one obtains for the distribution functions associated with $p_{n,k}$ and $p_k$ the
speed estimate
\[ \sup_k |F_n(k) - F(k)| \le \frac{L}{n^{\theta_2}}. \]
There, $L$ and $\theta_2$ are two positive constants. ◁


In the labelled universe, the functional composition schema encompasses the
sequence, set, and cycle constructions. It suffices to take for the outer function $g$ in the
composition $g \circ h$ the quantities
\[ Q(w) = \frac{1}{1 - w}, \qquad E(w) = e^{w}, \qquad L(w) = \log \frac{1}{1 - w}. \tag{12} \]
We state:


Proposition IX.3 (Subcritical constructions). Consider the constructions of sequence
$\mathfrak{S}(\mathcal{H})$, whether labelled or not, labelled set $\mathfrak{P}(\mathcal{H})$, and labelled cycle $\mathfrak{C}(\mathcal{H})$. Assume
the subcriticality conditions of the previous proposition, namely $\tau < 1$, $\tau < \infty$,
$\tau < 1$, respectively, where $\tau$ is the singular value of $h(z)$. Then, the distribution of
the number $\chi$ of components determined by $f_{n,k}/f_n$ is such that $\chi = 1 + Y$, where $Y$ admits a
discrete limit law that is of type, respectively: negative binomial NB[2], Poisson, and
geometric. For $k \ge 0$, the limit forms $q_k = \lim_n P(Y = k)$ are respectively
\[ q_k = (1 - \tau)^2 (k + 1)\tau^k, \qquad q_k = e^{-\tau}\frac{\tau^k}{k!}, \qquad q_k = (1 - \tau)\tau^k. \]
In an object of positive size, the number of components is always $\ge 1$. In terms of
the standard definition of the three laws (APPENDIX C: Special distributions, p. 720)
the distribution of the number of components is $\chi = 1 + Y$ where $Y$ is supported by
$\mathbb{Z}_{\ge 0}$.
PROOF. In accordance with Proposition IX.2 and Equation (12), the PGF of the
discrete limit law involves the derivatives
\[ Q'(w) = \frac{1}{(1 - w)^2}, \qquad E'(w) = e^{w}, \qquad L'(w) = \frac{1}{1 - w}. \]
The last two cases precisely give rise to the classical Poisson and geometric law. The
first case gives rise to the negative binomial law NB[2] which also appears in this
form as a sum of two geometrically distributed random variables.
The technical simplicity with which limit laws are pulled out of combinatorics is
worthy of note.


EXAMPLE IX.3. Root degrees in trees. Consider first the number of components in a sequence
(ordered forest) of general Catalan trees. The bivariate OGF is
\[ F(z,u) = \frac{1}{1 - uh(z)}, \qquad h(z) = \frac12\left(1 - \sqrt{1 - 4z}\right). \]
We have $\tau_h = 1/2 < \rho_g = 1$, so that the composition schema is subcritical. Thus, for a forest
of total size $n$, the number $X_n$ of tree components satisfies
\[ \lim_{n \to \infty} P\{X_n = k\} = \frac{k}{2^{k+1}}, \qquad (k \ge 1). \]
Since a tree is equivalent to a node appended to a forest, this asymptotic estimate also holds for
the root degree of a general Catalan tree.
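
The limit law $k/2^{k+1}$ can be compared with exact finite-size values. The Python sketch below is an illustration under our own naming; it counts ordered forests by extracting $[z^n]h(z)^k$ through truncated power-series products, and it uses the classical fact that $1/(1-h(z)) = h(z)/z$, so that the total number of forests of size $n$ is the Catalan number $C_n$.

```python
from fractions import Fraction
from math import comb


def catalan_tree_counts(n_max):
    """h[n] = number of general Catalan trees with n nodes (h[0] = 0, h[n] = C_{n-1})."""
    return [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, n_max + 1)]


def multiply(a, b, n_max):
    """Product of two power series (coefficient lists), truncated at degree n_max."""
    c = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(n_max + 1 - i):
                c[i + j] += ai * b[j]
    return c


def forest_component_law(n, k_max):
    """Exact P(X_n = k), k = 1..k_max, for ordered forests of total size n."""
    h = catalan_tree_counts(n)
    total = comb(2 * n, n) // (n + 1)  # [z^n] 1/(1 - h(z)) = C_n
    power, law = h[:], []
    for _ in range(k_max):
        law.append(Fraction(power[n], total))  # [z^n] h(z)^k / C_n
        power = multiply(power, h, n)
    return law


if __name__ == "__main__":
    for k, p in enumerate(forest_component_law(100, 5), start=1):
        print(k, float(p), k / 2 ** (k + 1))
```

Rational arithmetic keeps the finite-size probabilities exact, so the remaining gap to $k/2^{k+1}$ is entirely due to $n$ being finite.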

Consider next the number of components in a set (unordered forest) of Cayley trees. The
bivariate EGF is
\[ F(z,u) = e^{uh(z)}, \qquad h(z) = ze^{h(z)}. \]
We have $\tau_h = 1 < \rho_g = +\infty$, again a subcritical composition schema. Thus the number $X_n$
of tree components in a random unordered forest of size $n$ admits the limit distribution
\[ \lim_{n \to \infty} P\{X_n = k\} = e^{-1}/(k - 1)!, \qquad (k \ge 1), \]
a shifted Poisson law of parameter 1; asymptotically, the same property also holds for the root
degree of a random Cayley tree.

The same method applies more generally to a simple variety of trees $\mathcal{V}$ (see Chapter VII)
with generator $\phi$, under the condition of the existence of a root $\tau$ of the characteristic equation
$\phi(\tau) - \tau\phi'(\tau) = 0$ at a point interior to the disc of convergence of $\phi$. The BGF satisfies
\[ V(z,u) = z\phi(uV(z)), \qquad V(z) = \tau - \gamma\sqrt{1 - z/\rho} + O(1 - z/\rho), \]
so that
\[ V(z,u) \sim \rho\,\phi(u\tau) - \rho\gamma\, u\,\phi'(u\tau)\sqrt{1 - z/\rho}. \]
The PGF of the distribution of root degree is accordingly
\[ \frac{u\,\phi'(\tau u)}{\phi'(\tau)} = \sum_{k \ge 1} \frac{k\,\phi_k\,\tau^k}{\phi(\tau)}\, u^k. \]
(A limit law was established directly under its local form in Chapter VII.)
END OF EXAMPLE IX.3.


The root degree in a random labelled nonplane tree (Cayley tree) admits in the 
asymptotic limit a Poisson law, while the root degree of a large plane tree (a Catalan 
tree) tends to a negative binomial (NB[2]) distribution. Proposition IX.2 shows, in
a precise technical sense, that the negative binomial law for Catalan trees is a direct 
reflection of planarity specified by a sequence construction, while the Poisson law 
arises from the set construction attached to nonplanarity. 
▷ IX.8. Bell number distributions. Consider the “set-of-sets” schema
\[ F(z,u) = \exp\left(e^{uh(z)} - 1\right), \]
assuming subcriticality. This corresponds to a scheme $\mathcal{F} = \mathfrak{P}(\mathfrak{P}_{\ge 1}(\mathcal{H}))$. Then the number $\chi$
of components satisfies asymptotically a “derivative Bell” law:
\[ P(\chi = k) = \frac{1}{K}\, \frac{k\,S_k\,\tau^k}{k!}, \qquad K = \tau\, e^{e^{\tau} + \tau - 1}, \]
where $S_n = n!\,[z^n]e^{e^z - 1}$ is a Bell number. There exist parallel results: for sequence-of-sets,
involving the surjection numbers; for set-of-sequences, involving the fragmented permutation
numbers. ◁


▷ IX.9. High levels in Cayley trees. The number of nodes at level 5 (i.e., at distance 5 from the
root) in a Cayley tree has the nice PGF
\[ u\,\frac{d}{du}\left( e^{-1 + e^{-1 + e^{-1 + e^{-1 + e^{-1 + u}}}}} \right) \]
and thus involves “super Bell” numbers. ◁


A further direct application of continuity of PGFs is the distribution of the number
of $\mathcal{H}$-components of a fixed size $m$ in a composition $\mathcal{G}[\mathcal{H}]$ with GF $g(h(z))$, again
under the subcriticality condition. In the terminology of Chapter III, we are thus
characterizing the profile of combinatorial objects, at least as regards components of
some fixed size. The bivariate GF is then
\[ F(z,u) = g\left(h(z) + (u - 1)h_m z^m\right), \]
with $h_m = [z^m]h(z)$. The singular expansion at $z = \rho$ is
\[ F(z,u) = g\left(\tau + (u - 1)h_m\rho^m\right) - c\,g'\left(\tau + (u - 1)h_m\rho^m\right)(1 - z/\rho)^{\lambda} + o\left((1 - z/\rho)^{\lambda}\right). \]
Thus, the PGF $p_n(u)$ for objects of size $n$ satisfies
\[ \lim_{n \to \infty} p_n(u) = \frac{g'\left(\tau + (u - 1)h_m\rho^m\right)}{g'(\tau)}. \tag{13} \]
As before, this specializes in the case of sequences, sets, and cycles, giving a result
analogous to Proposition IX.2.


Proposition IX.4 (Fixed size components). Under the subcriticality conditions of
Propositions IX.2 and IX.3, the number of components of a fixed size $m$ in a random
sequence, set, or cycle construction applied to a class with GF $h(z)$ admits a discrete
limit law. With $h_m := [z^m]h(z)$, $\rho$ the radius of convergence of $h(z)$, and $\tau := h(\rho)$,
the distributions are as follows:

For sequences, the limit law is a negative binomial (NB[2]) of parameter
$a = \dfrac{h_m\rho^m}{1 - \tau + h_m\rho^m}$. For sets, the limit law is Poisson with parameter $\lambda = h_m\rho^m$. For
cycles, the limit is geometric of parameter $a = \dfrac{h_m\rho^m}{1 - \tau + h_m\rho^m}$.
EXAMPLE IX.4. Root subtrees of size m. In a Cayley tree, the number of root subtrees of
some fixed size $m$ has, in the limit, a Poisson distribution,
\[ p_k = e^{-\lambda}\frac{\lambda^k}{k!}, \qquad \lambda = \frac{m^{m-1}e^{-m}}{m!}. \]
In a general Catalan tree, the distribution is a negative binomial NB[2],
\[ p_k = (1 - a)^2 (k + 1)a^k, \qquad a^{-1} = 1 + \frac{m\,2^{2m-1}}{\binom{2m-2}{m-1}}. \]
Generally, for a simple variety of trees under the usual conditions of existence of a solution to
the characteristic equation, $V = z\phi(V)$, one finds “en deux coups de cuillère à pot”,
\[ V(z,u) = z\phi\left(V(z) + V_m z^m (u - 1)\right), \]
\[ V(z,u) \sim \rho\,\phi\left(\tau + V_m\rho^m(u - 1)\right) - \rho\gamma\,\phi'\left(\tau + V_m\rho^m(u - 1)\right)\sqrt{1 - z/\rho}, \]
\[ \lim_{n \to \infty} \frac{[z^n]V(z,u)}{[z^n]V(z)} = \frac{\phi'\left(\tau + V_m\rho^m(u - 1)\right)}{\phi'(\tau)}. \]
(Notations are the same as in Example IX.3.)
END OF EXAMPLE IX.4.
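
The parameters appearing in Example IX.4 follow mechanically from Proposition IX.4. The sketch below, with hypothetical function names of our own, evaluates the Poisson rate $h_m\rho^m$ for Cayley trees (where $h_m = m^{m-1}/m!$ and $\rho = e^{-1}$) and the negative binomial parameter $a = h_m\rho^m/(1 - \tau + h_m\rho^m)$ for general Catalan trees (where $h_m = \frac1m\binom{2m-2}{m-1}$, $\rho = 1/4$, $\tau = 1/2$).

```python
from math import comb, exp, factorial


def cayley_root_subtree_rate(m):
    """Poisson rate h_m * rho^m with h_m = m^(m-1)/m! and rho = 1/e."""
    return m ** (m - 1) * exp(-m) / factorial(m)


def catalan_root_subtree_parameter(m):
    """Parameter a = b / (1 - tau + b), b = h_m * rho^m, for general Catalan trees."""
    h_m = comb(2 * m - 2, m - 1) // m  # number of Catalan trees with m nodes
    b = h_m / 4 ** m                   # rho = 1/4
    tau = 0.5                          # tau = h(rho) = 1/2
    return b / (1 - tau + b)


def negative_binomial2(a, k):
    """Limit probability (1 - a)^2 (k + 1) a^k of the NB[2] law."""
    return (1 - a) ** 2 * (k + 1) * a ** k


if __name__ == "__main__":
    for m in (1, 2, 3, 5):
        a = catalan_root_subtree_parameter(m)
        print(m, cayley_root_subtree_rate(m), a, negative_binomial2(a, 0))
```

For the Catalan case, the value $1/a$ returned by this computation agrees with the closed form $1 + m\,2^{2m-1}/\binom{2m-2}{m-1}$ quoted in the example.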
Arbitrarily many schemas leading to discrete limit laws could be listed. Roughly,
conditions are that the auxiliary variable $u$ affects neither the location nor the nature
of the dominant singularity of $F(z,u)$. Such conditions are met by the subcritical
schemas, since eventually the auxiliary variable only appears as a multiplicative
coefficient in a local singular expansion.
▷ IX.10. The product schema. Define
\[ F(z,u) = A(uz) \cdot B(z), \]
that corresponds to a product construction, $\mathcal{F} = \mathcal{A} \times \mathcal{B}$, with $u$ marking the size of the
$\mathcal{A}$-component in the product. Assume that the radii of convergence satisfy $\rho_A > \rho_B$ and that
$B(z)$ has a unique dominant singularity of the algebraic-logarithmic type. Then, the size of the
$\mathcal{A}$ component in a random $\mathcal{F}$ structure has a discrete limit law with PGF,
\[ p(u) = \frac{A(\rho u)}{A(\rho)}, \qquad \rho = \rho_B. \]
The proof results directly from singularity analysis. Alternatively, an elementary proof can be
given based on the weaker requirement that the coefficients of $B$ satisfy $b_{n+1}/b_n \to \rho_B^{-1}$. ◁


Regarding the number of components, the case of a supercritical composition 
leads to continuous limit laws of the Gaussian type, as we shall see in the next sec- 
tions. The critical case may lead to a variety of probabilistic laws due to the confluence 
of singularities that then manifests itself. In the example that follows, we show that a
particular critical composition scheme already studied in Chapter VII leads to a collec- 
tion of Poisson laws describing the small component profile of composite structures. 


EXAMPLE IX.5. Small components in sets of logarithmic structures. Consider first the exp-log
schema in the simpler labelled case: it corresponds to the construction $\mathcal{F} = \mathfrak{P}(\mathcal{G})$,
that is, $F(z,u) = \exp(uG(z))$ under the assumption that $G(z)$ is logarithmic. This means
(Chapter VII) that $G(z)$ is $\Delta$-singular and satisfies locally
\[ G(z) = \kappa L(z/\rho) + \lambda + \eta(z), \qquad\text{where } L(z) := \log(1 - z)^{-1}, \]
and $\eta(z) = O(1/L(z/\rho)^2)$ as $z \to \rho$ in a $\Delta$-domain. We already know from Chapter VII
that the number of components has mean and variance each of the order of $\log n$, so that a
discrete limit law is not to be expected for the total number of components. However, the
situation becomes quite different if fixed size components are considered. A limit distribution
has already been obtained in Chapter VII under its local form and it may be revisited in the light
of methods of the present chapter as follows. Let $m$ be a fixed integer $\ge 1$, and let
$g_m = [z^m]G(z)$. The BGF of $\mathcal{F}$ objects with $u$ marking the number of $m$-components is
\[ F(z,u) = \exp\left((u - 1)g_m z^m\right) F(z), \qquad F(z) = \exp(G(z)). \]
Under the logarithmic assumption, one has for any $u$ in a small neighbourhood of 1 as $z \to \rho$
in a $\Delta$-domain:
\[ F(z,u) \sim e^{\lambda}\,\omega(u)\,(1 - z/\rho)^{-\kappa}, \qquad \omega(u) = \exp\left((u - 1)g_m\rho^m\right). \]
By singularity analysis, this tells us that the number of $m$-components in a random $\mathcal{F}$-structure
of large size tends to a Poisson distribution with parameter $g_m\rho^m$.

This result applies for any $m$ less than some arbitrary fixed bound $B$. In addition,
truly multivariate methods discussed at the end of this chapter enable one to prove that
the numbers of components of sizes $1, 2, \ldots, B$ are asymptotically independent. This gives a
very precise model of the probabilistic profile of small components in random $\mathcal{F}$-objects as
a product of independent Poisson laws of parameter $g_r\rho^r$ for $r = 1, \ldots, B$. Similar results
hold for unlabelled multisets, but with the negative binomial law replacing the Poisson law.
END OF EXAMPLE IX.5.

FIGURE IX.5. Small components of size ≤ 20 in random permutations (left)
and random mappings (right) of size 1,000: each object corresponds to a line and
each component is represented by a square of proportional area.


The previous example covers well known exp-log structures introduced by Flajo- 
let and Soria in [210]. In the labelled case, we have permutations (as sets of cycles), 
random mappings and 2-regular graphs (as sets of connected components). A render- 
ing of the cycle structure of random permutations already appears in Chapter III; see 
also Figure 5. In the unlabelled case, the prime example is that of polynomials over 
finite fields to which we return later in this chapter. 

In contrast, large component sizes cannot be independently distributed. (E.g., a
permutation can have only one cycle larger than n/2, two cycles larger than n/3, etc.)
A general probabilistic theory of the joint distribution of largest components in exp-log
structures has been developed by Arratia, Barbour, and Tavaré [16] and some of its
developments draw their inspiration from earlier studies conducted under the analytic
combinatorial angle. This joint distribution of large components can be characterized
in terms of what is known as the Poisson-Dirichlet process. For instance, as shown by
Gourdon [246], the largest component itself involves the Dickman function otherwise
known to describe the distribution of the largest prime divisor of a random integer over
a large interval of the form [1 .. N].


▷ IX.11. Random mappings. The number of components of some fixed size $m$ in a large
random mapping (functional graph) is asymptotically Poisson($\lambda$) where $\lambda = K_m e^{-m}/m!$ and
$K_m = m!\,[z^m]\log(1 - T)^{-1}$ enumerates connected mappings. (There $T$ is the Cayley tree
function.) The fact that $K_m e^{-m}/m! \approx 1/(2m)$ explains the fact that small components are
somewhat sparser for mappings than for permutations. ◁

FIGURE IX.6. Walks, excursions, bridges, and meanders: random samples of
length 50.


As a last example here, we discuss the length of the longest initial run of a’s in 
random binary words satisfying various types of constraints. This discussion com- 
pletes the informal presentation of Section IX. 1. The basic combinatorial objects are 
the set $\mathcal{W} = \{a, b\}^*$ of binary words. A word $w \in \mathcal{W}$ can also be viewed as
describing a walk in the plane, provided one interprets a and b as the vectors (+1, +1)
and (+1, −1) respectively. Such walks in turn describe fluctuations in coin tossing
games, as described by Feller [161]. What is especially interesting here is to observe
the complete chain where a specific constraint leads in succession to a combinatorial
decomposition, a specific analytic type of BGF, and a local singular structure that is
then reflected by a particular limit law. 


EXAMPLE IX.6. Initial runs in random walks. We consider here walks in the right half plane
that start from the origin and are made of steps a = (1, 1), b = (1, −1). According to the
discussion of Chapters V and VII, one can distinguish four major types of walks (Figure 6).

• Unconstrained walks ($\mathcal{W}$) corresponding to words and freely described by
  $\mathcal{W} = \mathfrak{S}(a, b)$;

• Dyck paths ($\mathcal{D}$) that always have a nonnegative ordinate and end at level 0; the
  closely related class $\mathcal{G} = \mathcal{D}b$ represents the collection of gambler's ruin sequences.
  In probability theory, Dyck paths are also referred to as excursions.

• Bridges ($\mathcal{B}$) that are walks that may have negative ordinates but must finish at level 0.

• Meanders ($\mathcal{M}$) which always have a nonnegative altitude and may end at an
  arbitrary nonnegative altitude.

The parameter $\chi$ of interest is in all cases the length of the longest initial run of a's.
First, the unconstrained walks obey the decomposition
\[ \mathcal{W} = \mathfrak{S}(a)\,\mathfrak{S}(b\,\mathfrak{S}(a)), \]
already employed in Chapters I and IV. Thus, the BGF is
\[ W(z,u) = \frac{1}{1 - zu}\, \frac{1}{1 - z(1 - z)^{-1}}. \]
By singularity analysis of the pole at $\rho = 1/2$, the PGF of $\chi$ on random words of $\mathcal{W}_n$ satisfies
\[ p_n(u) \sim \frac{1/2}{1 - u/2}, \]
and, as expected, this corresponds to a limit geometric law of parameter 1/2. This is the first
example presented (Section IX.1) in order to introduce discrete limit laws.

As is well known, Dyck paths D play an important rôle in combinatorial constructions 
related to lattice paths (Chapters I and V). A Dyck path decomposes into “arches” that are 
themselves Dyck paths encapsulated by a pair a, b, 
$$\mathcal{D} = \operatorname{SEQ}(a\,\mathcal{D}\,b),$$
which yields a GF of the Catalan domain, 
$$D(z) = \frac{1}{1 - z^2 D(z)}, \qquad D(z) = \frac{1 - \sqrt{1-4z^2}}{2z^2}.$$


In order to extract the initial run of a’s, we observe that a word whose initial a-run is $a^k$ 
contains k components of the form bD. This corresponds to a decomposition in terms of the first 
traversals of altitudes k − 1, ..., 1, 0, 
$$\mathcal{D} = \sum_{k \ge 0} a^k\,(b\,\mathcal{D})^k,$$
illustrated by the following diagram: 

Thus, the BGF is 
$$D(z,u) = \frac{1}{1 - z^2 u D(z)}.$$
This is an even function of z. In terms of the singular element, $\delta = (1-4z)^{1/2}$, one finds 
$$D(z^{1/2},u) = \frac{2}{2-u} - \frac{2u}{(2-u)^2}\,\delta + O(\delta^2),$$
as z → 1/4. Thus, the PGF of χ on random words of $\mathcal{D}_{2n}$ satisfies 
$$p_{2n}(u) \sim \frac{u}{(2-u)^2},$$
which is the PGF of a negative binomial NB[2] of parameter 1/2 shifted by 1. (Naturally, in this 
case, explicit expressions for the combinatorial distribution are available, as this is equivalent 
to the classical ballot problem.) 
A bridge decomposes into a sequence of arches, either positive or negative, 
$$\mathcal{B} = \operatorname{SEQ}(a\,\mathcal{D}\,b + b\,\overline{\mathcal{D}}\,a),$$
where $\overline{\mathcal{D}}$ is like $\mathcal{D}$, but with the rôles of a and b interchanged. In terms of OGFs, this gives 
$$B(z) = \frac{1}{1 - 2z^2 D(z)} = \frac{1}{\sqrt{1-4z^2}}.$$
The set $\mathcal{B}^{+}$ of nonempty walks that start with at least one a admits a decomposition similar to 
that of $\mathcal{D}$, 
$$\mathcal{B}^{+} = \Bigl(\sum_{k\ge 1} a^k\, b\,(\mathcal{D}\,b)^{k-1}\Bigr)\cdot\mathcal{B},$$
since the paths factor uniquely as a D component that hits 0 for the first time followed by a B 
oscillation. Thus, 
$$B^{+}(z) = \frac{z^2}{1 - z^2 D(z)}\,B(z).$$
The remaining cases $\mathcal{B}^{-} = \mathcal{B}\setminus\mathcal{B}^{+}$ consist of either the empty word or of a sequence of positive 
or negative arches starting with a negative arch, so that 
$$B^{-}(z) = 1 + \frac{z^2 D(z)}{1 - 2z^2 D(z)}.$$
The BGF results from these decompositions: 
$$B(z,u) = \frac{z^2 u}{1 - z^2 u D(z)}\,B(z) + 1 + \frac{z^2 D(z)}{1 - 2z^2 D(z)}.$$
Again, the singular expansion is obtained mechanically, 
$$B(z^{1/2},u) = \frac{1}{(2-u)\,\delta} + O(1),$$
where $\delta = (1-4z)^{1/2}$. Thus, the PGF of χ on random words of $\mathcal{B}_{2n}$ satisfies 
$$p_{2n}(u) \sim \frac{1}{2-u}.$$
The limit law is geometric of parameter 1/2. 

A meander decomposes into an initial run $a^k$, a succession of descents with their companion 
(positive) arches in some number ℓ ≤ k, and a succession of ascents with their corresponding 
(positive) arches. The computations are similar to the previous cases, more intricate, but 
still “automatic”. One finds that 
$$M(z,u) = \frac{1}{1-X} + \frac{XY}{(1-X)(1-XY)(1-Y)},$$
with X = zu, Y = zD(z), so that 
$$M(z,u) = 2\,\frac{1 - u - 2z + 2uz^2 + (u-1)\sqrt{1-4z^2}}{(1-zu)\,\bigl(1 - 2z - \sqrt{1-4z^2}\bigr)\,\bigl(2 - u + u\sqrt{1-4z^2}\bigr)}.$$


There are now two singularities at z = ±1/2, with singular expansions, 
$$M(z,u) \underset{z\to 1/2}{=} \frac{u\sqrt{2}}{(2-u)^2\sqrt{1-2z}} + O(1), \qquad M(z,u) \underset{z\to -1/2}{\sim} \frac{4-u}{4-u^2},$$
so that only the singularity at 1/2 matters asymptotically. Then, we have 
$$p_n(u) \sim \frac{u}{(2-u)^2},$$
and the limit law is a shifted negative binomial NB[2] of parameter 1/2. In summary: 


Proposition IX.5. The length of the initial run of a’s in unconstrained walks and bridges is 
asymptotically distributed like a geometric; in Dyck excursions and meanders like a negative 
binomial NB[2]. 


Similar analyses can be applied to walks with a finite set of steps [21]. 
END OF EXAMPLE IX.6. 
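
The four limit laws of Example IX.6 can be observed on small sizes by exhaustive enumeration. The following Python sketch is only an illustration: the function name `initial_run_counts`, the encoding of the letters a, b as steps +1, −1, and the choice n = 12 are assumptions made here, not part of the text. It tallies every walk of length n by class and by initial run length, then prints the empirical frequencies next to the limit probabilities $2^{-(k+1)}$ (geometric, from $1/(2-u)$) and $k\,2^{-(k+1)}$ (shifted NB[2], from $u/(2-u)^2$).

```python
from collections import Counter
from itertools import product
from math import comb


def initial_run_counts(n):
    """Tally walks of length n by class and by length of the initial run of a's.

    Steps are +1 (letter a) and -1 (letter b). Classes: W = all walks,
    B = bridges (end at 0), M = meanders (never negative),
    D = Dyck paths (meanders that end at 0).
    """
    counts = {name: Counter() for name in "WBMD"}
    for steps in product((1, -1), repeat=n):
        run = 0
        for s in steps:
            if s != 1:
                break
            run += 1
        height = lowest = 0
        for s in steps:
            height += s
            lowest = min(lowest, height)
        counts["W"][run] += 1
        if height == 0:
            counts["B"][run] += 1
        if lowest >= 0:
            counts["M"][run] += 1
            if height == 0:
                counts["D"][run] += 1
    return counts


n = 12
counts = initial_run_counts(n)
totals = {name: sum(c.values()) for name, c in counts.items()}

for k in range(5):
    geometric = 2.0 ** -(k + 1)        # coefficient of u^k in 1/(2-u)
    shifted_nb2 = k * 2.0 ** -(k + 1)  # coefficient of u^k in u/(2-u)^2
    print(k,
          counts["W"][k] / totals["W"], counts["B"][k] / totals["B"], geometric,
          counts["D"][k] / totals["D"], counts["M"][k] / totals["M"], shifted_nb2)
```

For unconstrained walks the agreement is exact for every k < n; for the three constrained classes the frequencies only approach the limit values as n grows, so at n = 12 a visible discrepancy is expected.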
▷ IX.12. The number of meanders. A meander uniquely decomposes into an excursion 
followed by a (possibly empty) sequence of elements of the form aD. There results that 
M(z) = D(z)/(1 − zD(z)), and 
$$M(z) = \frac{\sqrt{1-4z^2} - 1 + 2z}{2z(1-2z)},$$
so that $M_n = \binom{n}{\lfloor n/2 \rfloor}$. ◁ 


▷ IX.13. Leftmost branch of a unary-binary (Motzkin) tree. The class of unary-binary trees 
(or Motzkin trees) is defined as the class of unlabelled rooted plane trees where (out)degrees 
of nodes are restricted to the set {0, 1, 2}. The parameter χ under consideration is the length 
of the leftmost branch measured by the number of nodes it contains. A tree can be viewed as 
a leftmost branch at each node of which is grafted either nothing (the node has degree 1) or a 
tree, except for the last node on the branch. Hence the decomposition and the BGF: 


$$M(z) = \frac{z}{1 - z(1+M(z))}, \qquad M(z,u) = \frac{zu}{1 - zu(1+M(z))}.$$


The first equation corresponds to $M = z(1 + M + M^2)$ as it should. The dominant singularity 
is at z = 1/3 where M(1/3) = 1. There results that the limit PGF of χ is $4u/(3-u)^2$. The 
limit distribution is a negative binomial NB[2] with parameter 1/3, shifted by 1. ◁ 


IX. 4. Continuous limit laws 


Throughout this chapter, our goal is to quantify sequences of random variables 
$X_n$ that arise from an integer-valued combinatorial parameter χ defined on a combinatorial 
class $\mathcal{F}$. It is a fact that, when the mean $\mu_n$ and the standard deviation $\sigma_n$ of χ 
on $\mathcal{F}_n$ tend to infinity as n gets large, then a continuous limit law usually holds. That 
limit law arises not from the $X_n$ themselves (as was the case for discrete-to-discrete 
convergence in the previous section) but from their standardized versions: 
$$X_n^{*} = \frac{X_n - \mu_n}{\sigma_n}.$$


In this section, we provide definitions and major theorems needed to deal with the 
discrete-to-continuous situation. 


A random variable Y specified by its distribution function, 
$$\mathbb{P}\{Y \le x\} = F(x),$$
is said to be continuous if F(x) is continuous (see APPENDIX C: Random variables, 
p. 717). In that case, F(x) has no jump, and there is no single value in the range of Y 
that bears a nonzero probability mass. If in addition F(x) is differentiable, the random 
variable Y is said to have a density, g(x) = F′(x), so that 
$$\mathbb{P}(Y \le x) = \int_{-\infty}^{x} g(w)\,dw, \qquad \mathbb{P}\{x < Y \le x + dx\} = g(x)\,dx.$$
A particularly important case for us here is the standard Gaussian or normal distribution 
function, 
$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw,$$
also called the error function (erf), the corresponding density being 
$$\Phi'(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$$
This section and the next ones are relative to the existence of limit laws of the continuous 
type, with Gaussian limits playing a prominent rôle. The general definitions 
of convergence in law (or in distribution) and of weak convergence (see APPENDIX C: 
Convergence in law, p. 722) instantiate as follows. 


Definition IX.3 (Discrete-to-continuous convergence). Let Y be a continuous random 
variable with distribution function $F_Y(x)$. A sequence of random variables $Y_n$ with 
distribution functions $F_{Y_n}(x)$ is said to converge in distribution to Y if, pointwise, for 
each x, 
$$\lim_{n\to\infty} F_{Y_n}(x) = F_Y(x).$$
In that case, one writes $Y_n \overset{D}{\Rightarrow} Y$ and $F_{Y_n} \overset{D}{\Rightarrow} F_Y$. 
Convergence is said to take place at speed $\epsilon_n$ if 
$$\sup_{x\in\mathbb{R}} \bigl|F_{Y_n}(x) - F_Y(x)\bigr| \le \epsilon_n.$$


The definition does not a priori require uniform convergence. It is a known fact 
that convergence to a continuous limit is always uniform. This uniformity means that 
there always exists a speed $\epsilon_n$ that tends to 0 as n → ∞. 

Discrete limit laws can be established via convergence of probability generating 
functions to a common limit, as asserted by the continuity theorem for PGFs, Theo- 
rem IX.1. In the case of continuous limit laws, one has to resort to integral transforms 
(see APPENDIX C: Transforms of distributions, p. 718), whose definitions we now 
recall. 


— The Laplace transform (also called the moment generating function) 
$\lambda_Y(s)$ is defined by 
$$\lambda_Y(s) := \mathbb{E}\{e^{sY}\} = \int_{-\infty}^{+\infty} e^{sx}\,dF(x).$$

— The Fourier transform (also called the characteristic function) $\phi_Y(t)$ is 
defined by 
$$\phi_Y(t) := \mathbb{E}\{e^{itY}\} = \int_{-\infty}^{+\infty} e^{itx}\,dF(x).$$

(Integrals are taken in the sense of Lebesgue-Stieltjes or Riemann-Stieltjes; cf. APPENDIX C: 
Probability spaces and measure, p. 715.) 

There are two classical versions of the continuity theorem, one for characteris- 
tic functions, the other for Laplace transforms. Both may be viewed as extensions 
of the continuity theorem for PGF’s. Characteristic functions always exist and the 
corresponding continuity theorem gives a necessary and sufficient condition for con- 
vergence of distributions. As they are a universal tool, characteristic functions are 
therefore often favoured in the probabilistic literature. In the context of this book, 
strong analyticity properties go along with combinatorial constructions so that both 
transforms usually exist and can be put to good use. 


Theorem IX.4 (Continuity of integral transforms). Let $Y, Y_n$ be random variables 
with Fourier transforms (characteristic functions) $\phi(t), \phi_n(t)$, and assume that Y has 
a continuous distribution function. A necessary and sufficient condition for the convergence 
in distribution, $Y_n \Rightarrow Y$, is that, pointwise, for each real t, 
$$\lim_{n\to\infty} \phi_n(t) = \phi(t).$$

Let $Y, Y_n$ be random variables with Laplace transforms $\lambda(s), \lambda_n(s)$ that exist in 
a common interval $[-s_0, s_0]$. If, pointwise for each real $s \in [-s_0, s_0]$, 
$$\lim_{n\to\infty} \lambda_n(s) = \lambda(s),$$
then the $Y_n$ converge in distribution to Y: $Y_n \Rightarrow Y$. 

The first part of this theorem is also known as Lévy’s continuity theorem for characteristic 
functions. 
PROOF. See Billingsley’s book [55, Sec. 26], for Fourier transforms, and [55, p. 408], 
for Laplace transforms. 
▷ IX.14. Laplace transforms need not exist. Let $Y_n$ be a mixture of a Gaussian and a Cauchy 
distribution: 
$$\mathbb{P}(Y_n \le x) = \Bigl(1 - \frac{1}{n}\Bigr)\int_{-\infty}^{x}\frac{e^{-w^2/2}}{\sqrt{2\pi}}\,dw + \frac{1}{\pi n}\int_{-\infty}^{x}\frac{dw}{1+w^2}.$$
Then $Y_n$ converges in distribution to a standard Gaussian limit Y, though $\lambda_n(s)$ only exists 
for ℜ(s) = 0. ◁ 


The continuity theorem for PGFs eventually relies on continuity of the Cauchy 
coefficient formula that realizes the inversion needed in recovering coefficients from 
PGFs. Similarly, the continuity theorem for integral transforms may be viewed as 
expressing the continuity of inverse Laplace or Fourier transforms, this in the specific 
context of probability distribution functions. 

The next theorem is an effective version of the Fourier inversion theorem that 
proves especially useful for characterizing speeds of convergence. It bounds in a 
constructive manner the sup-norm distance between two distribution functions by a 
special metric distance between their characteristic functions. Recall that 
$\|f\|_\infty := \sup_{x\in\mathbb{R}} |f(x)|$. 

Theorem IX.5 (Berry-Esseen inequality). Let F, G be distribution functions with 
characteristic functions $\phi(t), \gamma(t)$. Assume that G has a bounded derivative. There 
exist absolute constants $c_1, c_2$ such that for any T > 0, 
$$\|F - G\|_\infty \le c_1\int_{-T}^{+T}\left|\frac{\phi(t)-\gamma(t)}{t}\right|dt + c_2\,\frac{\|G'\|_\infty}{T}.$$

PROOF. See Feller [162, p. 538] who gives 
$$c_1 = \frac{1}{\pi}, \qquad c_2 = \frac{24}{\pi}$$
as possible values for the constants. 
FIGURE IX.7. The standardized distribution functions of the binomial law 
(top), the corresponding Fourier transforms (middle), and the Laplace transforms 
(bottom), for n = 3, 6, 9, 12, 15. The distribution functions centred 
around the mean $\mu_n = n/2$ and scaled according to the standard deviation 
$\sigma_n = n^{1/2}/2$ converge to a limit which is the Gaussian error function, 
$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw$. Accordingly, the corresponding Fourier 
transforms (or characteristic functions) converge to $\phi(t) = e^{-t^2/2}$ while the 
Laplace transforms (or moment generating functions) converge to $\lambda(s) = e^{s^2/2}$. 


This theorem is typically used with G being the limit distribution function (often 
a Gaussian for which $\|G'\|_\infty = (2\pi)^{-1/2}$) and $F = F_n$ a distribution that belongs to 
a sequence converging to G. The quantity T may be assigned an arbitrary value; the 
one giving the best bound in a specific application context is then normally chosen. 


▷ IX.15. A general version of Berry-Esseen. Let F, G be two distribution functions. Define 
Lévy’s “concentration function”, $Q_G(h) := \sup_x\,(G(x+h) - G(x))$, h > 0. There exists 
an absolute constant C such that 
$$\|F - G\|_\infty \le C\,Q_G\!\left(\frac{1}{T}\right) + C\int_{-T}^{+T}\left|\frac{\phi(t)-\gamma(t)}{t}\right|dt.$$
See Elliott’s book [151, Lemma 1.47] and the article by Stef and Tenenbaum for a discussion 
[452]. The latter provides inequalities analogous to Berry-Esseen, but relative to Laplace 
transforms on the real line (bounds tend to be much weaker due to the smoothing nature of the 
Laplace transform). ◁ 


Large powers and the central limit theorem. The binomial distribution is defined 
as the distribution of a random variable $X_n$ with PGF 
$$p_n(u) = \left(\frac{1+u}{2}\right)^{n},$$
and characteristic function $\phi_n(t) = p_n(e^{it})$. The mean is $\mu_n = n/2$ and the variance 
is $\sigma_n^2 = n/4$. Therefore, the standardized variable $X_n^{*} = (X_n - \mu_n)/\sigma_n$ has 
characteristic function 
$$\phi_n^{*}(t) = \mathbb{E}\bigl(e^{itX_n^{*}}\bigr) = \left(\cos\frac{t}{\sqrt{n}}\right)^{n}. \tag{14}$$
The asymptotic form is directly found by taking logarithms, and one finds 
$$\log\phi_n^{*}(t) = n\log\left(1 - \frac{t^2}{2n} + O(n^{-2})\right) = -\frac{t^2}{2} + O\!\left(\frac{1}{n}\right), \tag{15}$$
pointwise, for any fixed t, as n → ∞. This establishes convergence to the Gaussian 
limit. In addition, the Berry-Esseen inequalities show that the speed of convergence is 
$O(n^{-1/2})$, a fact that is otherwise easily verified directly using Stirling’s formula. 
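
The $O(n^{-1/2})$ speed for the binomial case can be checked numerically. The short Python sketch below is an illustration only (the helper names `Phi` and `kolmogorov_distance_binomial` are introduced here and do not come from the text): it evaluates the sup-norm distance between the distribution function of the standardized binomial variable and Φ. Since the binomial distribution function is a step function, the supremum is reached at a jump, so both the left limit and the value at each jump are compared with Φ.

```python
from math import comb, erf, sqrt


def Phi(x):
    """Standard Gaussian distribution function, expressed through math.erf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def kolmogorov_distance_binomial(n):
    """sup_x |F_n(x) - Phi(x)| for the standardized binomial with PGF ((1+u)/2)^n."""
    mu, sigma = n / 2, sqrt(n) / 2
    distance, cdf = 0.0, 0.0
    for k in range(n + 1):
        g = Phi((k - mu) / sigma)
        distance = max(distance, abs(cdf - g))  # left limit at the jump
        cdf += comb(n, k) / 2 ** n
        distance = max(distance, abs(cdf - g))  # value at the jump
    return distance


distances = {n: kolmogorov_distance_binomial(n) for n in (25, 100, 400)}
for n, d in distances.items():
    print(n, d, d * sqrt(n))  # the last column should stay roughly constant
```

Multiplying n by 4 should roughly halve the distance, which is the behaviour the Berry-Esseen bound predicts.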
> IX.16. De Moivre’s Central Limit Theorem. Characteristic functions extend the normal limit 
law for unbiased Bernoulli distributions to the general case with PGF (p + qu)”, for fixed p, q 
with p + q = 1. (The result is accessible directly from Stirling’s formula, which constitutes De 
Moivre’s original derivation.) <q 

The central limit theorem (CLT; the term was coined by Pólya in 1920, originally 
because of its “zentrale Rolle” in probability theory) expresses 
the Gaussian character of sums of random variables. It was first discovered⁴ in the 
particular case of Bernoulli variables by De Moivre. The general version is due to 
Gauss (who, around 1809, had realized from his works on geodesy and astronomy 
the universality of the “Gaussian” law but had only unsatisfactory arguments) and to 
Laplace (in the period 1812-1820). Laplace in particular uses Fourier methods and his 
formulation of the CLT is fully general, though some of the precise validity conditions 
of his arguments only became apparent a century later. 


Theorem IX.6 (Basic CLT). Let $T_j$ be independent random variables supported by 
ℝ with a common distribution of (finite) mean μ and (finite) standard deviation σ. 
Let $S_n := T_1 + \cdots + T_n$. Then the standardized sum $S_n^{*}$ converges to the standard 
normal distribution, 
$$S_n^{*} \equiv \frac{S_n - \mu n}{\sigma\sqrt{n}} \Rightarrow \mathcal{N}(0,1).$$


⁴ For a perspective on historical aspects of the CLT, we refer to Hans Fischer’s well-informed 
monograph [167]. 
PROOF. The proof is based on local expansions of characteristic functions. First, by a 
general theorem, the existence of the first two moments implies that 
$$\phi_{T_1}(t) = 1 + i\mu t - \frac{1}{2}(\mu^2 + \sigma^2)\,t^2 + o(t^2), \qquad t \to 0.$$
By shifting, it suffices to consider the case of zero-mean variables (μ = 0). We then 
have, pointwise for each t as n → ∞, 
$$\phi_{S_n^{*}}(t) = \left(\phi_{T_1}\!\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^{n} = \left(1 - \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right)\right)^{n} \to e^{-t^2/2},$$
like in Equations (14) and (15). The conclusion follows from the continuity theorem. 
(This theorem is in virtually any basic book on probability theory, e.g., [162, p. 259] 
or [55, Sec. 27].) 

The central limit theorem in the independent case is the subject of Petrov’s com- 
prehensive monograph [394]. There are many extensions of the CLT, to variables that 
are independent but not necessarily identically distributed (the Lindeberg—Lyapunov 
conditions) or variables that are only dependent in some weak sense (mixing condi- 
tions); see the discussion by Billingsley [55, Sec. 27]. In the particular case where the 
T’s are discrete, a stronger “local” form of the Theorem results from the saddle point 
method; see our discussion in Chapter VIII, the classic treatment by Gnedenko and 
Kolmogorov [237], and extensions in Section IX. 9. 


▷ IX.17. Poisson distributions of large parameter. Let X be Poisson with rate λ. As λ tends to 
infinity, Stirling’s formula easily provides convergence to a Gaussian limit. The error terms can 
then be compared to what the Berry-Esseen bounds provide. (In terms of speed of convergence, 
such Poisson approximations to combinatorial distributions are sometimes of a better quality 
than the standard Gaussian law; see Hwang’s comprehensive study [276] for a general analytic 
approach.) ◁ 


IX. 5. Quasi-powers and Gaussian limits 


The central limit theorem of probability theory admits a fruitful extension in the 
context of analytic combinatorics. As we now show, it suffices that the PGF of a 
combinatorial parameter behaves nearly like a large power of a fixed function to en- 
sure convergence to a Gaussian limit. We first illustrate this point by considering the 


Stirling cycle distribution. 
EXAMPLE IX.7. The Stirling cycle distribution. Consider the Stirling cycle numbers $\left[{n \atop k}\right]$, and 
let $X_n$ be the corresponding random variable with probability distribution $\frac{1}{n!}\left[{n \atop k}\right]$, with PGF, 
$$p_n(u) = \binom{u+n-1}{n} = \frac{u(u+1)\cdots(u+n-1)}{n!} = \frac{\Gamma(u+n)}{\Gamma(u)\,\Gamma(n+1)}.$$
We have for fixed u near 1, 
$$p_n(u) = \frac{n^{u-1}}{\Gamma(u)}\left(1 + O\!\left(\frac{1}{n}\right)\right) = \frac{1}{\Gamma(u)}\left(e^{u-1}\right)^{\log n}\left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{16}$$
As results from Stirling’s formula for the Gamma function (or from singularity analysis of 
$[z^n](1-z)^{-u}$, Chapter VI), the error term in (16) is $O(n^{-1})$ when u stays in a small enough 
neighbourhood of 1, for instance |u − 1| ≤ 1/2. Thus, as n → +∞, $p_n(u)$ is approximately a 
“large power” of $e^{u-1}$ taken with exponent log n, multiplied by a fixed function, $(\Gamma(u))^{-1}$. By 
analogy to the central limit theorem, we may expect a Gaussian law. 

The mean satisfies $\mu_n = \log n + \gamma + o(1)$, the standard deviation satisfies $\sigma_n = \sqrt{\log n} + o(1)$. 
We thus consider the standardized random variable, 
$$X_n^{*} = \frac{X_n - L - \gamma}{\sqrt{L}}, \qquad L = \log n,$$
whose characteristic function is 
$$\phi_n^{*}(t) = \frac{e^{-it(L^{1/2} + \gamma L^{-1/2})}}{\Gamma\bigl(e^{it/\sqrt{L}}\bigr)}\,\exp\Bigl(L\bigl(e^{it/\sqrt{L}} - 1\bigr)\Bigr)\left(1 + O\!\left(\frac{1}{n}\right)\right).$$
For fixed t, with L → ∞, the logarithm is then found mechanically to satisfy 
$$\log\phi_n^{*}(t) = -\frac{t^2}{2} + O\bigl((\log n)^{-1/2}\bigr).$$
This is sufficient to establish a Gaussian limit law, 
$$\lim_{n\to\infty}\mathbb{P}\bigl\{X_n \le \log n + \gamma + x\sqrt{\log n}\bigr\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw. \tag{17}$$


Proposition IX.6 (Goncharov’s Theorem). The Stirling cycle distribution, 
$\mathbb{P}(X_n = k) = \frac{1}{n!}\left[{n \atop k}\right]$, describing the number of cycles and the number of records in a 
random permutation of size n is asymptotically normal. 


This result was obtained by Goncharov as early as 1944, see [240], albeit without an error 
term as his investigations predate the Berry-Esseen inequalities. END OF EXAMPLE IX.7. 
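
The mean and variance estimates used in Example IX.7 can be confirmed on exact data. In the Python sketch below (an illustration; the function name `cycle_distribution` and the size n = 60 are choices made here), the PGF $u(u+1)\cdots(u+n-1)/n!$ is expanded with exact rational arithmetic, and the resulting mean and variance are compared with the harmonic numbers and with $\log n + \gamma$ and $\log n$. The identities mean = $H_n$ and variance = $H_n - H_n^{(2)}$ are classical facts about cycles of random permutations, recalled here rather than taken from the text.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329


def cycle_distribution(n):
    """Exact coefficients of u(u+1)...(u+n-1)/n!, i.e. the list of P(X_n = k)."""
    poly = [Fraction(1)]
    for j in range(n):
        # Multiply the current polynomial by (u + j) / (j + 1).
        new = [Fraction(0)] * (len(poly) + 1)
        for k, c in enumerate(poly):
            new[k + 1] += c / (j + 1)
            new[k] += c * j / (j + 1)
        poly = new
    return poly


n = 60
probs = cycle_distribution(n)
mean = sum(k * p for k, p in enumerate(probs))
variance = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
harmonic = sum(Fraction(1, j) for j in range(1, n + 1))
harmonic2 = sum(Fraction(1, j * j) for j in range(1, n + 1))

print(float(mean), log(n) + EULER_GAMMA)  # mean versus log n + gamma
print(float(variance), log(n))            # variance versus its leading term log n
```

The variance differs from log n by a bounded amount, which is consistent with $\sigma_n = \sqrt{\log n} + o(1)$ but shows how slowly the normalization settles.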


The cycle example is characteristic of the occurrence of Gaussian laws in analytic 
combinatorics. What happens is that the approximation (16) by a power with “large” 
exponent $\beta_n = \log n$ leads, after normalization, to the characteristic function of a 
Gaussian variable, namely $e^{-t^2/2}$. From there, the limit distribution (17) results by 
the continuity theorem. This is in fact a very general phenomenon, as demonstrated 
by a theorem of Hsien-Kuei Hwang [272, 275] that we state next and that builds upon 
earlier statements of Bender and Richmond [36]. 

The following notations prove especially convenient: given a function f(u) analytic 
at u = 1, we set 
$$m(f) = \frac{f'(1)}{f(1)}, \qquad v(f) = \frac{f''(1)}{f(1)} + \frac{f'(1)}{f(1)} - \left(\frac{f'(1)}{f(1)}\right)^{2}. \tag{18}$$
The notations m, v suggest their probabilistic counterparts while neatly distinguishing 
between the analytic and probabilistic realms: If f is the PGF of a random variable X, 
then f(1) = 1 and m(f), the mean, coincides with the expectation E(X); the quantity 
v(f) then coincides with the variance V(X). 


Theorem IX.7 (Quasi-Powers, Central law). Let the $X_n$ be nonnegative discrete random 
variables with probability generating function $p_n(u)$. Assume that, uniformly in 
a fixed complex neighbourhood of u = 1, for sequences $\beta_n, \kappa_n \to +\infty$, there holds 
$$p_n(u) = A(u)\,\bigl(B(u)\bigr)^{\beta_n}\left(1 + O\!\left(\frac{1}{\kappa_n}\right)\right), \tag{19}$$
where A(u), B(u) are analytic at u = 1 and A(1) = B(1) = 1. Assume finally that 
B(u) satisfies the so-called “variability condition”, 
$$v(B(u)) \equiv B''(1) + B'(1) - B'(1)^2 \ne 0.$$
Under these conditions, the distribution of $X_n$ is asymptotically Gaussian, and the 
speed of convergence to the Gaussian limit is $O(\kappa_n^{-1} + \beta_n^{-1/2})$: 
$$\mathbb{P}\left\{\frac{X_n - \beta_n\,m(B(u))}{\sqrt{\beta_n\,v(B(u))}} \le x\right\} = \Phi(x) + O\!\left(\frac{1}{\kappa_n} + \frac{1}{\sqrt{\beta_n}}\right).$$
The mean and variance of $X_n$ satisfy 
$$\begin{aligned} \mu_n \equiv \mathbb{E}(X_n) &= \beta_n\,m(B(u)) + m(A(u)) + O(\kappa_n^{-1}), \\ \sigma_n^2 \equiv \mathbb{V}(X_n) &= \beta_n\,v(B(u)) + v(A(u)) + O(\kappa_n^{-1}). \end{aligned} \tag{20}$$

This theorem is a direct application of the following lemma, also due to Hwang, 
that applies more generally to arbitrary discrete or continuous distributions, and is thus 
entirely phrased in terms of integral transforms. 


Lemma IX.1 (Quasi-Powers, general distributions). Assume that the Laplace transforms 
$\lambda_n(s) = \mathbb{E}\{e^{sX_n}\}$ of a sequence of random variables $X_n$ are analytic in a disc 
|s| ≤ ρ, for some ρ > 0, and satisfy there an expansion of the form 
$$\lambda_n(s) = e^{\beta_n U(s) + V(s)}\left(1 + O\!\left(\frac{1}{\kappa_n}\right)\right), \tag{21}$$
with $\beta_n, \kappa_n \to +\infty$ as n → +∞, and U(s), V(s) analytic in |s| ≤ ρ. Assume also 
the variability condition, 
$$U''(0) \ne 0.$$
Under these assumptions, the mean and variance of $X_n$ satisfy 
$$\begin{aligned} \mathbb{E}\{X_n\} &= \beta_n U'(0) + V'(0) + O(\kappa_n^{-1}), \\ \mathbb{V}\{X_n\} &= \beta_n U''(0) + V''(0) + O(\kappa_n^{-1}). \end{aligned} \tag{22}$$
The distribution of $X_n$ is asymptotically Gaussian and the speed of convergence to the 
Gaussian limit is $O(\kappa_n^{-1} + \beta_n^{-1/2})$. 
PROOF. This closely follows the lines of Hwang’s works [272, 275]. First, we estimate 
the mean and variance. The variable s is a priori restricted to a small neighbourhood 
of 0. By assumption, the function $\log\lambda_n(s)$ is analytic at 0 and it satisfies 
$$\log\lambda_n(s) = \beta_n U(s) + V(s) + O\!\left(\frac{1}{\kappa_n}\right).$$
This asymptotic expansion carries over, with the same type of error term, to derivatives 
at 0 because of analyticity: this can be checked directly from Cauchy integral 
representations, 
$$\frac{1}{r!}\,\frac{d^r}{ds^r}\log\lambda_n(s)\Big|_{s=0} = \frac{1}{2i\pi}\oint_{\gamma}\log\lambda_n(s)\,\frac{ds}{s^{r+1}},$$
upon using a small but fixed integration contour γ and taking advantage of the basic 
expansion of $\log\lambda_n(s)$. In particular, the mean and variance satisfy the estimates 
of (22). 
Next, we consider the standardized variable, 
$$X_n^{*} = \frac{X_n - \beta_n U'(0)}{\sqrt{\beta_n U''(0)}}, \qquad \lambda_n^{*}(s) = \mathbb{E}\{e^{sX_n^{*}}\}.$$
We have 
$$\log\lambda_n^{*}(s) = -\frac{\beta_n U'(0)}{\sqrt{\beta_n U''(0)}}\,s + \log\lambda_n\!\left(\frac{s}{\sqrt{\beta_n U''(0)}}\right).$$
Local expansions to third order based on the assumption (21) show that 
$$\log\lambda_n^{*}(s) = \frac{s^2}{2} + O\!\left(\frac{|s| + |s|^3}{\beta_n^{1/2}}\right) + O\!\left(\frac{1}{\kappa_n}\right), \tag{23}$$
uniformly with respect to s in a disc of radius $O(\beta_n^{1/2})$, and in particular in any fixed 
neighbourhood of 0. This is enough to conclude as regards convergence in distribution 
to a Gaussian limit, by the continuity theorem of either Laplace transforms (restricting 
s to be real) or of Fourier transforms (taking s = it). 

Finally, the speed of convergence results from the Berry-Esseen inequalities. Take 
$T \equiv T_n = c\,\beta_n^{1/2}$, where c is taken sufficiently small but nonzero, in such a way that 
the local expansion of $\lambda_n(s)$ at 0 applies. Then, the expansion (23) instantiated at 
s = it entails that the quantity 
$$\Delta_n := \int_{-T_n}^{T_n}\left|\frac{\lambda_n^{*}(it) - e^{-t^2/2}}{t}\right|dt + \frac{1}{T_n}$$
satisfies 
$$\Delta_n = O\!\left(\frac{1}{\kappa_n} + \frac{1}{\beta_n^{1/2}}\right),$$
and the statement follows by the Berry-Esseen theorem. 

Theorem IX.7 applies immediately to the Stirling cycle distribution for which 
the estimate (16) was derived. It shows in addition that the speed of convergence is 
$O((\log n)^{-1/2})$ for this distribution. 
The Quasi-Powers Theorem under either form (19) or (21) can be read formally 
as expressing the distribution of a (pseudo)random variable 
$$Z = Y_0 + W_1 + W_2 + \cdots + W_{\beta_n},$$
where $Y_0$ “corresponds” to $e^{V(s)}$ (or A(u)) and each $W_j$ to $e^{U(s)}$ (or B(u)). However, 
there is no a priori requirement that $\beta_n$ should be an integer, nor that $e^{U(s)}$, $e^{V(s)}$ 
be Laplace transforms of probability distribution functions. In a way, the theorem 
recycles the intuition that underlies the central limit theorem and makes use of the 
analytic machinery behind it. But, in applications, functions like $e^{U(s)}$, $e^{V(s)}$ do not 
necessarily admit a direct probabilistic interpretation. 

It is of particular importance to note that the conditions of Theorem IX.7 and 
Lemma IX.1 are purely local: what is required is local analyticity of the quasi-power 
approximation at u = 1 for PGF’s or, equivalently, s = 0 for Laplace-Fourier transforms. 
This important feature is ultimately due to the normalization of random variables 
and transforms that goes along with continuous limit laws. 


▷ IX.18. Higher moments under Quasi-powers. Following Hwang [275], one has under the 
conditions of the Quasi-Powers Theorem, Lemma IX.1, and for each fixed k, 
$$\mathbb{E}(X_n^k) = k!\,\varpi_k(\beta_n) + O\!\left(\frac{\beta_n^{k}}{\kappa_n}\right), \qquad \varpi_k(x) := [s^k]\,e^{xU(s) + V(s)}.$$
($\varpi_k$ is a polynomial of degree k, which describes precisely the behaviour of higher moments.) ◁ 


Singularity perturbation and Gaussian laws. The main thread of this chapter is 
bivariate generating functions. In general, we are given a BGF F(z, u) and aim at 
extracting a limit distribution from it. The quasi-power paradigm in the form (19) is 
what one should look for, in the case where the mean and the standard deviation both 
tend to infinity with the size n of the model. 

We proceed heuristically in this informal discussion. Start from the BGF and 
consider u as a parameter. If singularity analysis applies to the counting generating 
function F(z, 1), it leads to an approximation, 
$$f_n \approx C\cdot\rho^{-n}\,n^{\alpha},$$
where ρ is the dominant singularity of F(z, 1) and α is related to the critical exponent 
of F(z, 1) at ρ. A similar type of analysis is often applicable to F(z, u) for u near 1. 
Then, it is reasonable to expect an approximation for the z-coefficients of the bivariate 
GF, 
$$f_n(u) \approx C(u)\,\rho(u)^{-n}\,n^{\alpha(u)}.$$
In this perspective, the corresponding PGF is of the form 
$$p_n(u) \approx \frac{C(u)}{C(1)}\left(\frac{\rho(u)}{\rho(1)}\right)^{-n} n^{\alpha(u)-\alpha(1)}.$$
The strategy envisioned here is thus a perturbation analysis of singular expansions 
with the auxiliary parameter u being restricted to a small neighbourhood of 1. 

In particular if only the dominant singularity moves with u, we have a rough form 
$$p_n(u) \approx \frac{C(u)}{C(1)}\left(\frac{\rho(u)}{\rho(1)}\right)^{-n},$$
suggesting a Gaussian law with mean and variance that are both O(n). If only the 
exponent moves, then 
$$p_n(u) \approx \frac{C(u)}{C(1)}\left(e^{\alpha(u)-\alpha(1)}\right)^{\log n}$$
suggests again a Gaussian law, but with mean and variance that are both O(log n). 
These cases point to the fact that a rather simple perturbation of a univariate 
analysis may yield limiting Gaussian distributions. Each major coefficient extraction 
method of Chapters IV-VIII plays a rôle, and the present chapter illustrates this 
important point in the following contexts: 


— meromorphic analysis for functions with polar singularities (Section IX.6 
below, based on a perturbation of methods of Chapters IV and V); 

— singularity analysis for functions with algebraic-logarithmic singularity 
(Section IX.7 below, based on a perturbation of methods of Chapters VI 
and VII); 

— saddle point analysis for functions with fast growth at their singularity 
(Section IX.8 below, based on a perturbation of methods of Chapter VIII). 


Roughly, the decomposable character of many elementary combinatorial structures 
is reflected by strong analyticity properties of bivariate GF’s that, after perturbation 
analysis, lead, via the Quasi-Powers Theorem (Theorem IX.7), to Gaussian laws. 
The coefficient extraction methods being based on contour integration supply the 
necessary uniformity conditions. (In contrast, Darboux’s method or Tauberian theorems, 
being nonconstructive, are not normally applicable in this context.) 


IX.6. Perturbation of meromorphic asymptotics 


This section discusses schemas that rely on the analysis of coefficients of mero- 
morphic functions, as discussed in Chapters IV and V. It is largely based on works 
of Bender who, starting with his seminal article [28], was the first to propose abstract 
analytic schemas leading to Gaussian laws in analytic combinatorics. Our presenta- 
tion also follows subsequent works of Bender, Flajolet, Hwang, Richmond, and So- 
ria [36, 210, 212, 272, 273, 274, 275, 443]. 


EXAMPLE IX.8. The surjection distribution. We revisit the distribution of image cardinality in 
surjections for which the concentration property has been established in Chapter V. This example 
serves to introduce bivariate asymptotics in the meromorphic case. Consider the distribution 
of image cardinality in surjections, with BGF 
$$F(z,u) = \frac{1}{1 - u(e^z - 1)}.$$
Restrict u near 1, for instance |u − 1| ≤ 1/10. The function F(z, u), as a function of z, is 
meromorphic with singularities at 
$$\rho(u) + 2ik\pi, \qquad \rho(u) := \log\left(1 + \frac{1}{u}\right).$$
The principal determination of the logarithm is used (with ρ(u) near log 2 when u is near 1). It 
is then seen that ρ(u) stays within 0.06 from log 2, for |u − 1| ≤ 1/10. Thus ρ(u) is the unique 
dominant singularity of F, the next nearest one being ρ(u) + 2iπ with modulus certainly larger 
than 6. 

From the coefficient analysis of meromorphic functions (Chapter IV), the quantities 
$f_n(u) = [z^n]F(z,u)$ are estimated as follows, 
$$\begin{aligned} f_n(u) &= -\operatorname{Res}\bigl(F(z,u)\,z^{-n-1}\bigr)_{z=\rho(u)} + \frac{1}{2i\pi}\int_{|z|=5} F(z,u)\,\frac{dz}{z^{n+1}} \\ &= \frac{1}{u\,\rho(u)\,e^{\rho(u)}}\,\rho(u)^{-n} + O(5^{-n}). \end{aligned} \tag{24}$$
It is important to note that the error term is uniform with respect to u, once u has been 
constrained to satisfy |u − 1| ≤ 0.1. This fact derives from the coefficient extraction method, 
since, in the remainder Cauchy integral of (24), the denominator of F(z, u) stays bounded 
away from 0. 

The second estimate in Equation (24) constitutes a prototypical case of application of the 
quasi-power schema. Thus, the number $X_n$ of image points in a random surjection of size n 
obeys in the limit a Gaussian law. The local expansion of ρ(u), 
$$\rho(u) \equiv \log(1 + u^{-1}) = \log 2 - \frac{1}{2}(u-1) + \frac{3}{8}(u-1)^2 + \cdots,$$
yields 
$$\frac{\rho(1)}{\rho(u)} = 1 + \frac{1}{2\log 2}(u-1) - \frac{3\log 2 - 2}{8(\log 2)^2}(u-1)^2 + O\bigl((u-1)^3\bigr),$$
so that the mean and standard deviation satisfy 
$$\mu_n \sim C_1 n, \quad \sigma_n \sim \sqrt{C_2 n}, \qquad C_1 := \frac{1}{2\log 2}, \quad C_2 := \frac{1 - \log 2}{4(\log 2)^2}.$$
In particular, the variability condition is satisfied. Finally, one obtains, with Φ the Gaussian 
error function, 
$$\mathbb{P}\bigl\{X_n \le C_1 n + x\sqrt{C_2 n}\bigr\} = \Phi(x) + O\!\left(\frac{1}{\sqrt{n}}\right).$$
This estimate can alternatively be viewed as a purely asymptotic statement regarding Stirling 
partition numbers. 
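Proposition IX.7. The surjection distribution defined as $\frac{k!}{S_n}\left\{{n \atop k}\right\}$, with $S_n = \sum_k k!\left\{{n \atop k}\right\}$ the 
normalizing factor (the surjection number), satisfies uniformly for all real x, 
$$\frac{1}{S_n}\sum_{k \le C_1 n + x\sqrt{C_2 n}} k!\left\{{n \atop k}\right\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw + O\!\left(\frac{1}{\sqrt{n}}\right).$$


This result already appears in Bender’s foundational study [28]. END OF EXAMPLE IX.8. 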
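
The constants $C_1$ and $C_2$ of Example IX.8 can be compared with exact values at a moderate size. The Python sketch below is illustrative only (the function name `surjection_weights` and the size n = 200 are assumptions of this sketch): it builds the Stirling partition numbers from the recurrence $S(m,k) = k\,S(m-1,k) + S(m-1,k-1)$, weights them by k!, and compares the exact mean and variance of the image cardinality, divided by n, with $C_1$ and $C_2$.

```python
from fractions import Fraction
from math import log


def surjection_weights(n):
    """Return [k! * S(n, k) for k = 0..n], with S(n, k) the Stirling partition numbers."""
    row = [1]  # S(0, 0) = 1
    for m in range(1, n + 1):
        # S(m, k) = k * S(m-1, k) + S(m-1, k-1)
        row = [(k * row[k] if k < m else 0) + (row[k - 1] if k >= 1 else 0)
               for k in range(m + 1)]
    weights, factorial = [], 1
    for k, s in enumerate(row):
        if k > 0:
            factorial *= k
        weights.append(factorial * s)
    return weights


n = 200
w = surjection_weights(n)
total = sum(w)  # the surjection number S_n
mean = Fraction(sum(k * x for k, x in enumerate(w)), total)
variance = Fraction(sum(k * k * x for k, x in enumerate(w)), total) - mean ** 2

C1 = 1 / (2 * log(2))
C2 = (1 - log(2)) / (4 * log(2) ** 2)
print(float(mean) / n, C1)
print(float(variance) / n, C2)
```

Exact integer and rational arithmetic is used throughout so that the subtraction in the variance does not suffer from cancellation; only the final ratios are converted to floating point.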


The following analytic schema vastly generalizes the case of surjections. It is 
again strongly inspired by the works of Bender [28]. 


Theorem IX.8 (Meromorphic schema). Let F(z, u) be a bivariate function that is 
bivariate analytic at (z, u) = (0, 0) and has nonnegative coefficients there. Assume 
that F(z, 1) is meromorphic in |z| ≤ r with only a simple pole at z = ρ for some 
positive ρ < r. Assume also the following conditions. 

(i) Meromorphic perturbation: there exists ε > 0 and r > ρ such that in the 
domain, D = {|z| ≤ r} × {|u − 1| < ε}, the function F(z, u) admits the 
representation 
$$F(z,u) = \frac{B(z,u)}{C(z,u)},$$
where B(z, u), C(z, u) are analytic for (z, u) ∈ D with B(ρ, 1) ≠ 0. (Thus 
ρ is a simple zero of C(z, 1).) 

(ii) Nondegeneracy: one has $\partial_z C(\rho,1)\cdot\partial_u C(\rho,1) \ne 0$, ensuring the existence 
of a nonconstant ρ(u) analytic at u = 1, such that C(ρ(u), u) = 0 and ρ(1) = ρ. 

(iii) Variability: one has 
$$v\!\left(\frac{\rho(1)}{\rho(u)}\right) \ne 0.$$

Then, the random variable $X_n$ with probability generating function 
$$p_n(u) = \frac{[z^n]F(z,u)}{[z^n]F(z,1)}$$
converges in distribution to a Gaussian variable with a speed of convergence that is 
$O(n^{-1/2})$. The mean and the variance of $X_n$ are asymptotically linear in n. 


First we offer a few comments. Given the analytic solution $\rho(u)$ of the implicit
equation $C(\rho(u),u) = 0$, the PGF $E(u^{X_n})$ satisfies a quasi-powers approximation of
the form $A(u)\,(\rho(1)/\rho(u))^n$, as we prove below. The mean $\mu_n$ and variance $\sigma_n^2$ are
then of the form

\[ \mu_n = m\!\left(\frac{\rho(1)}{\rho(u)}\right) n + O(1), \qquad \sigma_n^2 = v\!\left(\frac{\rho(1)}{\rho(u)}\right) n + O(1). \tag{25} \]

The variability condition of the Quasi-Powers Theorem is precisely ensured by
condition (iii). Set

\[ c_{i,j} := \left.\frac{\partial^{i+j}}{\partial z^i\,\partial u^j}\,C(z,u)\right|_{(\rho,1)}. \]

The numerical coefficients in (25) can themselves be expressed solely in terms of
partial derivatives of $C(z,u)$ by series reversion,

\[ \rho(u) = \rho - \frac{c_{0,1}}{c_{1,0}}(u-1) - \frac{c_{1,0}^2 c_{0,2} - 2c_{1,0}c_{1,1}c_{0,1} + c_{2,0}c_{0,1}^2}{2c_{1,0}^3}(u-1)^2 + O((u-1)^3). \tag{26} \]

In particular the fact that $\rho(u)$ is nonconstant, analytic, and a simple root corresponds
to $c_{0,1}c_{1,0} \ne 0$ (by the analytic Implicit Function Theorem). The variance condition
is then computed to be equivalent to the cubic inequality in the $c_{i,j}$:

\[ \rho\,c_{1,0}^2 c_{0,2} - 2\rho\,c_{1,0}c_{1,1}c_{0,1} + \rho\,c_{2,0}c_{0,1}^2 + c_{0,1}^2 c_{1,0} + c_{0,1}c_{1,0}^2\,\rho \ne 0. \tag{27} \]
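As a small numerical companion to (25) and (26), the following Python sketch turns the partial derivatives $c_{i,j}$ into the two coefficients $m(\rho(1)/\rho(u))$ and $v(\rho(1)/\rho(u))$, using $m(f) = f'(1)$ and $v(f) = f''(1) + f'(1) - f'(1)^2$ for $f(1) = 1$. The function name and the test case (the binomial BGF $1/(1 - z(1+u)/2)$, for which $\rho = 1$) are illustrative choices made here and are not taken from the text.

```python
from fractions import Fraction

def quasi_power_coefficients(rho, c10, c01, c20, c11, c02):
    """Return (m, v) for f(u) = rho / rho(u), where rho(u) solves C(rho(u), u) = 0.

    The arguments c_ij are the partial derivatives of C at (rho, 1).
    """
    # Local expansion rho(u) = rho + a (u-1) + b (u-1)^2 + ..., by series reversion.
    a = -c01 / c10
    b = -(c10**2 * c02 - 2 * c10 * c11 * c01 + c20 * c01**2) / (2 * c10**3)
    # f(u) = rho / rho(u) = 1 - (a/rho)(u-1) + ((a/rho)^2 - b/rho)(u-1)^2 + ...
    f1 = -a / rho                          # f'(1)
    f2 = 2 * ((a / rho) ** 2 - b / rho)    # f''(1)
    return f1, f2 + f1 - f1**2

# C(z, u) = 1 - z (1 + u) / 2, a fair coin tossed n times, with rho = 1.
args = [Fraction(x) for x in (1, -1, Fraction(-1, 2), 0, Fraction(-1, 2), 0)]
print(quasi_power_coefficients(*args))  # (Fraction(1, 2), Fraction(1, 4))
```

The output agrees with the mean $n/2$ and variance $n/4$ of a binomial distribution, which is the expected behaviour for this test case.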


PROOF. We can now proceed with asymptotic estimates. Consider a domain $|u-1| \le \delta$
inside the region of analyticity of $B$, $C$. Then, one has

\[ f_n(u) := [z^n]F(z,u) = \frac{1}{2i\pi} \oint F(z,u)\,\frac{dz}{z^{n+1}}, \]

where the integral is taken along a small enough contour encircling the origin. We
use the analysis of polar singularities described in Chapter IV, exactly like in (24). As
$F(z,u)$ has at most one (simple) pole in $|z| \le r$, we have

\[ f_n(u) = -\operatorname{Res}\!\left(\frac{B(z,u)}{C(z,u)}\,z^{-n-1};\ z = \rho(u)\right) + \frac{1}{2i\pi} \int_{|z|=r} F(z,u)\,\frac{dz}{z^{n+1}}, \tag{28} \]

where we may assume $u$ suitably restricted by $|u-1| \le \delta$ in such a way that
$|\rho(u) - \rho| < \frac{1}{2}(r - \rho)$.

The modulus of the second term in (28) is bounded from above by

\[ \frac{K}{r^n}, \qquad \text{where} \quad K = \frac{\sup_{|z|=r,\ |u-1|\le\delta} |B(z,u)|}{\inf_{|z|=r,\ |u-1|\le\delta} |C(z,u)|}. \tag{29} \]

Since the domain $|z| = r$, $|u-1| \le \delta$ is closed, $|C(z,u)|$ attains its minimum, which must
be nonzero, given the unicity of the zero of $C$. At the same time, $B(z,u)$ being
analytic, its modulus is bounded from above. Thus, the constant $K$ in (29) is finite.

A residue computation of the first term, in accordance with the analysis of
meromorphic functions, then yields

\[ f_n(u) = \frac{B(\rho(u),u)}{-C'_z(\rho(u),u)}\,\rho(u)^{-n-1} + O(r^{-n}), \]

uniformly for $u$ in a small enough fixed neighbourhood of 1. The mean and variance
then satisfy (25), with the coefficient in the leading term of the variance that is,
by assumption, nonzero. Thus, the conditions of the Quasi-Powers Theorem in the
form (19) are satisfied, and the law is Gaussian in the asymptotic limit.


Some form of condition regarding nondegeneracy is a necessity. For instance, the
functions

\[ \frac{1}{1-z}, \qquad \frac{1}{1-zu}, \qquad \frac{1}{1-zu^2}, \qquad \frac{1}{1-z^2u}, \]

each fail to satisfy the nondegeneracy and the variability condition, and the variance
of the corresponding discrete distribution is identically 0. The combinatorial variance
is $O(1)$ for a related function like

\[ F(z,u) = \frac{1}{1 - z(u+2) + 2z^2u} = \frac{1}{(1-2z)(1-zu)}, \]

which is excluded by the variability condition of the theorem; there, a discrete limit
law, a geometric, is known to hold (see page 572). Yet another situation arises when
considering

\[ F(z,u) = \frac{1}{(1-z)(1-zu)}. \]

There is now a double pole at 1 when $u = 1$ that arises from "confluence" at $u = 1$ of
two analytic branches $\rho_1(u) = 1$ and $\rho_2(u) = 1/u$. In this particular case, the limit
law is continuous but non-Gaussian; in fact, this limit is the uniform distribution over
the interval $[0,1]$, since

\[ F(z,u) = 1 + z(1+u) + z^2(1+u+u^2) + z^3(1+u+u^2+u^3) + \cdots. \]

In addition, for this case, the mean is $O(n)$ but the variance is $O(n^2)$. Such situations
are briefly examined in Section IX.11 at the end of this chapter.
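The confluent case $1/((1-z)(1-zu))$ is simple enough to check by hand, and the short Python sketch below does so: it builds the coefficient of $z^n$ as a polynomial in $u$ and computes its mean and variance exactly. The helper names are hypothetical; the only fact used is that $[z^i]\,1/(1-z) = 1$ and $[z^j]\,1/(1-zu) = u^j$.

```python
from fractions import Fraction

def row(n):
    """Coefficients in u of [z^n] 1/((1-z)(1-zu)), by convolving two geometric series."""
    poly = [0] * (n + 1)
    for j in range(n + 1):      # term z^(n-j) from 1/(1-z) times (zu)^j from 1/(1-zu)
        poly[j] += 1
    return poly

def mean_and_variance(poly):
    """Mean and variance of the distribution with weights poly[k] on the value k."""
    total = sum(poly)
    m1 = Fraction(sum(k * c for k, c in enumerate(poly)), total)
    m2 = Fraction(sum(k * k * c for k, c in enumerate(poly)), total)
    return m1, m2 - m1 * m1

print(mean_and_variance(row(10)))  # (Fraction(5, 1), Fraction(10, 1))
```

The distribution is uniform over $\{0, \ldots, n\}$, so the mean is $n/2$ and the variance is $n(n+2)/12$, which grows quadratically and rules out a Gaussian limit under the usual $\sqrt{n}$ normalization.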
▷ IX.19. Higher order poles. Under the conditions of Theorem IX.8, a limit Gaussian law holds
for the distributions generated by the BGF $F(z,u)^m$, which has an $m$th order pole. See [28]. ◁


EXAMPLE IX.9. The Central Limit Theorem and discrete renewal theory. Let $g(u)$ be any
PGF ($g(1) = 1$) of a random variable supported by $\mathbb{Z}_{\ge 0}$ that is analytic at 1 and nondegenerate
(i.e., $v(g) > 0$). Then

\[ F(z,u) = \frac{1}{1 - z\,g(u)} \]

has a singularity at $1/g(u)$ that is a simple pole.
Theorem IX.8 then applies to give a weak form of the central limit theorem for discrete
probability distributions with PGFs that are analytic at 1. (In such a case, a refined Gaussian
convergence property, namely a local limit law, see Chapter VIII and Section IX.9 below, also derives
from the saddle point method.)

Under the same analytic assumptions on $g$, consider now the "dual" BGF,

\[ G(z,u) = \frac{1}{1 - u\,g(z)}, \]

where the roles of $z$ and $u$ have been interchanged. In addition, we must impose for
consistency that $g(0) = 0$. There is a simple probabilistic interpretation in terms of renewal processes
of classical probability theory. Assume a light bulb has a lifetime of $m$ days with
probability $g_m = [z^m]g(z)$ and is replaced as soon as it ceases to function. Let $X_n$ be the number
of light bulbs consumed in $n$ days assuming independence, conditioned upon the fact that a
replacement takes place on the $n$th day. Then the PGF of $X_n$ is $[z^n]G(z,u)/[z^n]G(z,1)$.
(The normalizing quantity $[z^n]G(z,1)$ is precisely the probability that a renewal takes place on
day $n$.) Theorem IX.8 applies. The function $G$ has a simple dominant pole at $z = \rho(u)$ such
that $g(\rho(u)) = 1/u$, with $\rho(1) = 1$ since $g$ is by assumption a PGF. One finds

\[ \frac{1}{\rho(u)} = 1 + \frac{1}{g'(1)}(u-1) + \frac{1}{2}\,\frac{g''(1) + 2g'(1) - 2g'(1)^2}{g'(1)^3}(u-1)^2 + \cdots. \]

Thus the limit distribution of $X_n$ is normal with mean and variance satisfying

\[ E(X_n) \sim \frac{n}{\mu}, \qquad V(X_n) \sim n\,\frac{\sigma^2}{\mu^3}, \]

where $\mu := m(g)$ and $\sigma^2 := v(g)$ are the mean and variance attached to $g$. (This
calculation checks the variability condition en passant.) The mean value result certainly conforms to
probabilistic intuition. END OF EXAMPLE IX.9.
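The renewal estimates can be tested on a toy lifetime distribution. The Python sketch below assumes $g(z) = (z + z^2)/2$ (a bulb lasts one or two days with equal probability), so that $\mu = 3/2$ and $\sigma^2 = 1/4$; this choice of $g$ and the helper names are assumptions made for illustration. It computes $[z^n u^k]\,1/(1 - u\,g(z)) = [z^n]\,g(z)^k$ by dynamic programming and compares the growth of the mean and variance with $1/\mu = 2/3$ and $\sigma^2/\mu^3 = 2/27$.

```python
def renewal_table(n_max):
    """a[n][k] = [z^n] g(z)^k for g(z) = (z + z^2)/2."""
    a = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    a[0][0] = 1.0
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            # The last bulb lasted one day or two days, each with probability 1/2.
            a[n][k] = 0.5 * a[n - 1][k - 1]
            if n >= 2:
                a[n][k] += 0.5 * a[n - 2][k - 1]
    return a

def mean_and_variance(weights):
    """Mean and variance of the distribution proportional to weights[k] on k."""
    total = sum(weights)
    m1 = sum(k * w for k, w in enumerate(weights)) / total
    m2 = sum(k * k * w for k, w in enumerate(weights)) / total
    return m1, m2 - m1 * m1

table = renewal_table(61)
m60, v60 = mean_and_variance(table[60])
m61, v61 = mean_and_variance(table[61])
print(m61 - m60, 2 / 3)    # increment of the mean versus 1/mu
print(v61 - v60, 2 / 27)   # increment of the variance versus sigma^2/mu^3
```

Since the BGF here is rational with a second pole well separated from the dominant one, the increments agree with the predicted constants up to exponentially small terms.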


▷ IX.20. Renewals every day. In the renewal scenario, no longer condition on the fact that a
bulb breaks down on day $n$. Let $Y_n$ be the number of bulbs consumed so far. Then the BGF of
$Y_n$ is found by expressing that there is a sequence of renewals followed by a last renewal that is
to be credited to all intermediate epochs:


Pr EORTC 
Soe n) 1—ug(z) l-z ¢ 


n>1 
A Gaussian limit also holds for $Y_n$. ◁


▷ IX.21. A mixed CLT-renewal scenario. Consider $G(z,u) = 1/(1 - g(z,u))$ where $g$ has
nonnegative coefficients, satisfies $g(1,1) = 1$, and is analytic at $(z,u) = (1,1)$. This models
the situation where bulbs are replaced but a random cost is incurred, depending on the duration
of the bulb. Under general conditions, a limit law holds and it is Gaussian. This applies for
instance to $H(z,u) = 1/(1 - a(z)b(u))$, where $a$ and $b$ are nondegenerate PGFs (a random
repairman is called). ◁


The preceding discussion of renewal processes also brings us extremely close
analytically to a sequence schema $\mathcal{F} = \mathfrak{S}(\mathcal{G})$ and

\[ F(z,u) = \frac{1}{1 - u\,g(z)}, \]

in the case where the schema is supercritical. It is then possible to refine the moment
estimates of Chapter V and obtain the probabilistic profile of supercritical sequences.

FIGURE IX.8. When components are sorted by size and represented by vertical
segments of corresponding length, supercritical sequences present various profiles
described by Proposition IX.8. The diagrams display the mean profiles of large
surjections, alignments, and compositions for component sizes $\le 5$.


Proposition IX.8 (Supercritical sequences). Consider a sequence schema that is
supercritical, i.e., the value of $g$ at its dominant positive singularity satisfies $\tau_g > 1$.
Assuming $g$ to be aperiodic and $g(0) = 0$, the number $X_n$ of $\mathcal{G}$-components in a
random $\mathcal{F}_n$ structure of some large size $n$ is asymptotically Gaussian with

\[ E(X_n) \sim \frac{n}{\rho\,g'(\rho)}, \qquad V(X_n) \sim n\,\frac{\rho\,g''(\rho) + g'(\rho) - \rho\,g'(\rho)^2}{\rho^2\,g'(\rho)^3}, \]

where $\rho$ is the positive root of $g(\rho) = 1$, that is, the radius of convergence of $1/(1 - g(z))$.
The number $X_n^{(m)}$ of components of some fixed size $m$ is asymptotically normal with
mean $\sim \theta_m n$, where $\theta_m = g_m\rho^m/(\rho\,g'(\rho))$.

PROOF. The first part is a direct consequence of Theorem IX.8 and of the previous
calculations with $\rho$ replacing 1. The second part results from the BGF

\[ F(z,u) = \frac{1}{1 - \left(g(z) + (u-1)\,g_m z^m\right)}, \]

and from the fact that $u$ close to 1 induces a smooth perturbation of the pole at $\rho$
corresponding to $u = 1$.

This proposition applies to alignments, surjections, and compositions of various
sorts, including compositions into prime summands. The profile of supercritical
sequences is then appreciably different from what was obtained in the subcritical case,
where discrete limit laws prevail. Fundamentally, the proportion of components of fixed
size $m$ is close to $\theta_m$, up to Gaussian fluctuations. The diagrams of Chapter V and
Figure 8 clearly illustrate this situation.
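The constants attached to a supercritical sequence can be evaluated numerically. The Python sketch below applies Theorem IX.8 to $1/(1 - u\,g(z))$ with $\rho$ the positive root of $g(\rho) = 1$: the mean constant is $1/(\rho g'(\rho))$ and the variance constant is $(\rho g'' + g' - \rho g'^2)/(\rho^2 g'^3)$ evaluated at $\rho$ (this closed form is derived here from the expansion of $\rho(u)$ and should be read as a worked assumption). It is then compared with exact surjection counts $k!\,{n \brace k}$, for which $g(z) = e^z - 1$ and $\rho = \log 2$. All function names are hypothetical.

```python
import math
from fractions import Fraction

def supercritical_constants(rho, g1, g2):
    """Mean and variance constants for 1/(1 - u g(z)); g1 = g'(rho), g2 = g''(rho)."""
    mean = 1.0 / (rho * g1)
    var = (rho * g2 + g1 - rho * g1 ** 2) / (rho ** 2 * g1 ** 3)
    return mean, var

def surjection_moments(n):
    """Exact mean and variance of the number of blocks in a random surjection of size n."""
    t = [1]                                  # t[k] = k! * S(size, k), starting at size 0
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            prev_k = t[k] if k < len(t) else 0
            new[k] = k * (prev_k + t[k - 1])  # the last element joins an old or a new block
        t = new
    total = sum(t)
    m1 = Fraction(sum(k * c for k, c in enumerate(t)), total)
    m2 = Fraction(sum(k * k * c for k, c in enumerate(t)), total)
    return m1, m2 - m1 * m1

mean_c, var_c = supercritical_constants(math.log(2), 2.0, 2.0)
m30, v30 = surjection_moments(30)
m31, v31 = surjection_moments(31)
print(float(m31 - m30), mean_c)   # both close to 1/(2 log 2)
print(float(v31 - v30), var_c)    # both close to (1 - log 2)/(4 (log 2)^2)
```

For alignments, where $g(z) = \log(1-z)^{-1}$ and $\rho = 1 - e^{-1}$, the same function returns $1/(e-1)$ and $1/(e-1)^2$.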




▷ IX.22. Alignments and Stirling cycle numbers. Alignments are sequences of cycles (Chapter II),
corresponding to $\mathfrak{S}(\mathfrak{C}_{\ge 1}(\mathcal{Z}))$, with exponential BGF

\[ F(z,u) = \frac{1}{1 - u\log(1-z)^{-1}}. \]

The function $\rho(u)$ is explicit, $\rho(u) = 1 - e^{-1/u}$, and the number of cycles in a random
alignment is asymptotically Gaussian. This yields an asymptotic statement on Stirling cycle
numbers: uniformly for all real $x$, with $O_n = \sum_k k!\,{n \brack k}$ the alignment number, there holds

\[ \frac{1}{O_n} \sum_{k \le C_1 n + x\sqrt{C_2 n}} k!\,{n \brack k} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-w^2/2}\,dw + O\!\left(\frac{1}{\sqrt{n}}\right), \]

where the two constants $C_1$, $C_2$ are $C_1 = \frac{1}{e-1}$, $C_2 = \frac{1}{(e-1)^2}$. ◁


▷ IX.23. Summands in constrained integer compositions. Consider integer compositions where
the summands are constrained to belong to a set $\Gamma \subseteq \mathbb{N}_{+}$ and let $X_n$ be the number of
summands in a random composition of integer $n$. The ordinary BGF is

\[ F(z,u) = \frac{1}{1 - u\,g(z)}, \qquad g(z) = \sum_{\gamma \in \Gamma} z^{\gamma}. \]

Assume that $\Gamma$ contains at least two relatively prime elements, so that $g(z)$ is aperiodic. The
radius of convergence of $g(z)$ can only be $\infty$ (when $g(z)$ is a polynomial) or 1 (when $g(z)$
comprises infinitely many terms but is dominated by $(1-z)^{-1}$). At any rate, the sequence
construction is supercritical, so that the distribution of $X_n$ is asymptotically normal. For instance, a
Gaussian limit holds for compositions into prime or even twin-prime summands of Chapter V. ◁


The next two examples are relative to runs in permutations and patterns in words. 
They do not resort to a supercritical sequence but their analytic structure is very much 
similar. It is of interest to note that the BGFs were each deduced in Chapter II by an 
inclusion-exclusion argument that involves sequences in an essential way. 


EXAMPLE IX.10. Ascending runs in permutations and Eulerian numbers. The exponential
BGF of Eulerian numbers (that count runs in permutations) is

\[ F(z,u) = \frac{u(1-u)}{e^{(u-1)z} - u}, \]

where, for $u = 1$, we have $F(z,1) = (1-z)^{-1}$. The roots of the denominator are then

\[ \rho_k(u) = \rho(u) + \frac{2ik\pi}{u-1}, \qquad \rho(u) = \frac{\log u}{u-1}, \]

where $k$ is an arbitrary element of $\mathbb{Z}$. As $u$ is close to 1, $\rho(u)$ is close to 1, while the other
poles $\rho_k(u)$ with $k \ne 0$ escape to infinity. This fact is also consistent with the limit form
$F(z,1) = (1-z)^{-1}$, which has only one pole at 1. If one restricts $u$ to $|u| < 2$, there is clearly
at most one root of the denominator in $|z| < 2$, which is given by $\rho(u)$. Thus, we have for $u$ close
enough to 1,

\[ F(z,u) = \frac{1}{\rho(u) - z} + R(z,u), \]

with $R(z,u)$ analytic in $|z| < 2$, and

\[ [z^n]F(z,u) = \rho(u)^{-n-1} + O(2^{-n}). \]

FIGURE IX.9. The diagrams of poles of the BGF $F(z,u)$ associated to the
pattern abaa with correlation polynomial $c(z) = 1 + z^3$ when $u$ varies on the unit
circle. The denominator is of degree 4 in $z$: one branch, $\rho(u)$, clusters near the
dominant singularity $\rho = \frac{1}{2}$ of $F(z,1)$ while three other singularities stay away
from the disc $|z| \le \frac{1}{2}$ and escape to infinity as $u \to 1$.


The variability conditions are satisfied since

\[ \rho(u) = \frac{\log u}{u-1} = 1 - \frac{1}{2}(u-1) + \frac{1}{3}(u-1)^2 + \cdots, \]

so that $v(1/\rho(u)) = \frac{1}{12}$ is nonzero.

Proposition IX.9. The Eulerian distribution is asymptotically Gaussian, with mean and
variance given by $\mu_n = \frac{n+1}{2}$, $\sigma_n^2 = \frac{n+1}{12}$.


This example is a famous one and our derivation follows Bender's paper [28]. The Gaussian
character of the distribution has been known for a long time; it is for instance to be found in
David and Barton's Combinatorial Chance [108], published in 1962. There are in this case
interesting connections with elementary probability theory: if the $U_j$ are independent random variables
that are uniformly distributed over the interval $[0,1]$, then one has

\[ [z^n u^k]F(z,u) = P\{\lfloor U_1 + \cdots + U_n \rfloor = k - 1\}. \]

Because of this fact, the normal limit is often derived as a consequence of the central limit
theorem of probability theory, after one takes care of unimportant details relative to the integer
part $\lfloor \cdot \rfloor$ function; see [108, 422]. END OF EXAMPLE IX.10.
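The Eulerian mean and variance are easy to confirm exactly for small sizes. The Python sketch below uses the classical recurrence $A(n,k) = k\,A(n-1,k) + (n-k+1)\,A(n-1,k-1)$ for the number of permutations of size $n$ with $k$ ascending runs; the recurrence is standard but is not stated in the text, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def eulerian_row(n):
    """row[k] = number of permutations of size n with exactly k ascending runs."""
    row = [0, 1]                                  # size 1: a single run
    for m in range(2, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            same = row[k] if k < len(row) else 0  # insert m without creating a run
            new[k] = k * same + (m - k + 1) * row[k - 1]
        row = new
    return row

def run_moments(n):
    """Exact mean and variance of the number of runs in a random permutation of size n."""
    row = eulerian_row(n)
    total = sum(row)                              # equals n!
    m1 = Fraction(sum(k * c for k, c in enumerate(row)), total)
    m2 = Fraction(sum(k * k * c for k, c in enumerate(row)), total)
    return m1, m2 - m1 * m1

print(eulerian_row(4))   # [0, 1, 11, 11, 1]
print(run_moments(10))   # (Fraction(11, 2), Fraction(11, 12))
```

For every $n \ge 2$ the exact values coincide with $(n+1)/2$ and $(n+1)/12$, so in this example the $O(1)$ terms of the general estimates vanish.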


EXAMPLE IX.11. Patterns in strings. Consider the class $\mathcal{F}$ of binary strings (the "texts"),
and fix a "pattern" $w$ of length $k$. Let $\chi$ be the number of (possibly overlapping) occurrences
of $w$. (The pattern $w$ occurs if it is a factor, i.e., if its letters occur contiguously in the text.) Let
$F(z,u)$ be the BGF relative to the pair $(\mathcal{F}, \chi)$. The Guibas-Odlyzko correlation polynomial⁵
$c(z) = c_w(z)$ relative to $w$ is defined for instance in [434], where it is shown that the OGF of
words with pattern $w$ excluded is

\[ F(z,0) = \frac{c(z)}{z^k + (1-2z)\,c(z)}. \]

By similar string decompositions, the full BGF is found to be [192, p. 145]

\[ F(z,u) = \frac{1 - (c(z)-1)(u-1)}{1 - 2z - (u-1)\left(z^k + (1-2z)(c(z)-1)\right)}. \]

Let $D(z,u)$ be the denominator. Then $D(z,u)$ depends analytically on $z$, for $u$ near 1 and
$z$ near 1/2. In addition, the partial derivative $D'_z(\frac{1}{2},1)$ is nonzero. Thus, $\rho(u)$ is analytic
at $u = 1$, with $\rho(1) = 1/2$. The local expansion of the root $\rho(u)$ of $D(\rho(u),u) = 0$ follows from
local series reversion,

\[ 2\rho(u) = 1 - 2^{-k}(u-1) + \left(k\,2^{-2k} - 2^{-k}\left(c(\tfrac{1}{2}) - 1\right)\right)(u-1)^2 + O((u-1)^3). \]

Theorem IX.8 applies.


Proposition IX.10. The number of occurrences of a fixed pattern in a random large string is
asymptotically normal. The number of occurrences has mean $\mu_n$ and variance $\sigma_n^2$ that satisfy

\[ \mu_n = \frac{n}{2^k} + O(1), \qquad \sigma_n^2 = \left(2^{-k}\left(2c(\tfrac{1}{2}) - 1\right) + 2^{-2k}(1 - 2k)\right) n + O(1). \]

The mean does not depend on the order of letters, only on the length of the pattern.
END OF EXAMPLE IX.11.
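The linear growth of the mean and variance can be checked by brute force on short texts. The Python sketch below enumerates all binary strings of a given length, counts overlapping occurrences of a pattern, and compares the increments of the exact mean and variance with $2^{-k}$ and $2^{-k}(2c(\frac{1}{2}) - 1) + 2^{-2k}(1-2k)$, where $c(z)$ is taken with constant term 1 (the trivial self-match at shift 0). The function names and the choice of the pattern aba are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def occurrence_moments(pattern, n):
    """Exact mean and variance of the number of (overlapping) occurrences of
    `pattern` in a uniformly random binary text of length n, by enumeration."""
    k = len(pattern)
    s1 = s2 = 0
    for letters in product("ab", repeat=n):
        text = "".join(letters)
        x = sum(text.startswith(pattern, i) for i in range(n - k + 1))
        s1 += x
        s2 += x * x
    mean = Fraction(s1, 2 ** n)
    return mean, Fraction(s2, 2 ** n) - mean ** 2

def correlation_at_half(pattern):
    """c(1/2), where [z^j]c(z) = 1 iff the pattern matches itself shifted by j."""
    k = len(pattern)
    return sum(Fraction(1, 2 ** j) for j in range(k) if pattern[j:] == pattern[: k - j])

def predicted_slopes(pattern):
    """Per-letter growth of the mean and of the variance."""
    k = len(pattern)
    p = Fraction(1, 2 ** k)
    return p, p * (2 * correlation_at_half(pattern) - 1) + p * p * (1 - 2 * k)

m7, v7 = occurrence_moments("aba", 7)
m8, v8 = occurrence_moments("aba", 8)
print(m8 - m7, v8 - v7)            # 1/8 7/64
print(predicted_slopes("aba"))     # (Fraction(1, 8), Fraction(7, 64))
```

Because occurrences at positions more than $k-1$ apart are independent, the increments are exactly constant once the text is long enough, which is why such short lengths already reproduce the asymptotic slopes.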


▷ IX.24. Patterns in Bernoulli texts. Asymptotic normality also holds when letters in strings
are chosen independently but with an arbitrary probability distribution. It suffices to use the
weighted correlation polynomial described in a note of Chapter III. ◁


EXAMPLE IX.12. Parallelogram polyominoes. Polyominoes are plane diagrams that are
closely related to models of statistical physics, while having been the subject of a vast
combinatorial literature. This example has the merit of illustrating a level of difficulty somewhat
higher than in previous examples and typical of many "real-life" applications. Our presentation
follows an early article [30] and a more recent paper of Louchard [342]. We consider here
the variety of polyominoes called parallelograms. A parallelogram is a sequence of segments,

\[ [a_1,b_1],\ [a_2,b_2],\ \ldots,\ [a_m,b_m], \qquad a_1 \le a_2 \le \cdots \le a_m, \quad b_1 \le b_2 \le \cdots \le b_m, \]

where the $a_j$ and $b_j$ are integers with $b_j - a_j \ge 1$, and one takes $a_1 = 0$ for definiteness. A
parallelogram can thus be viewed as a stack of segments (with $[a_{j+1},b_{j+1}]$ placed on top of
$[a_j,b_j]$) that leans smoothly to the right:


⁵The correlation polynomial, as defined in Chapter I, has coefficients in $\{0,1\}$, with $[z^j]c(z) = 1$ iff
$w$ matches its left shifted image by $j$ positions.

(This instance has area 39, width 13, height 9, and perimeter 13 + 9 = 22.)

The quantity $m$ is called the height, the quantity $b_m - a_1$ the width, their sum is called
the (semi)perimeter, and the grand total $\sum_j (b_j - a_j)$ is called the area. We examine
parallelograms of fixed area and investigate the distribution of the perimeter. The ordinary BGF of
parallelograms, with $z$ marking area and $u$ marking perimeter, turns out to be

\[ F(z,u) = u\,\frac{J_1(z,u)}{J_0(z,u)}, \tag{30} \]

where $J_0$, $J_1$ belong to the realm of "$q$-analogues" and generalize the classical Bessel functions,

\[ J_0(z,u) := \sum_{n \ge 0} \frac{(-1)^n u^n q^{n(n+1)/2}}{(q;q)_n\,(uq;q)_n}, \qquad J_1(z,u) := \sum_{n \ge 1} \frac{(-1)^{n-1} u^n q^{n(n+1)/2}}{(q;q)_{n-1}\,(uq;q)_n}, \]

where $q = z$, with the "$q$-factorial" notation being used:

\[ (a;q)_n = (1-a)(1-aq)\cdots(1-aq^{n-1}). \]

The expression (30) of the BGF results from a simple construction: a parallelogram is
either an interval, or it is derived from an existing parallelogram by stacking on top a new
interval. Let $G(w) = G(x,y,z,w)$ be the OGF with $x$, $y$, $z$, $w$ marking width, height, area,
and length of top segment, respectively. The GF of a parallelogram made of a single nonzero
interval is

\[ a(w) \equiv a(x,y,z,w) = \frac{xyzw}{1 - xzw}. \]

The operation of piling up a new segment on top of a segment of length $m$ that is represented
by a term $w^m$ is described by

\[ y\left(\frac{z^m w^m}{1 - xzw} + \cdots + \frac{zw}{1 - xzw}\right) = yzw\,\frac{1 - z^m w^m}{(1 - zw)(1 - xzw)}. \]

Thus, $G$ satisfies the functional equation,

\[ G(w) = \frac{xyzw}{1 - xzw} + \frac{yzw}{(1 - zw)(1 - xzw)}\,\bigl[G(1) - G(zw)\bigr]. \tag{31} \]

This is the method of "adding a slice" already employed in Chapter III and reflected by the
relation (31). Now, an equation of the form

\[ G(w) = a(w) + b(w)\,\bigl[G(1) - G(\lambda w)\bigr] \]

is solved by iteration:

\[ \begin{aligned} G(w) &= a(w) + b(w)G(1) - b(w)G(\lambda w) \\ &= \bigl(a(w) - b(w)a(\lambda w) + b(w)b(\lambda w)a(\lambda^2 w) - \cdots\bigr) \\ &\quad + G(1)\bigl(b(w) - b(w)b(\lambda w) + b(w)b(\lambda w)b(\lambda^2 w) - \cdots\bigr). \end{aligned} \]

One then isolates $G(1)$ by setting $w = 1$. This expresses $G(1)$ as the quotient of two similar
looking series (formed with sums of products of $b$-values). Here, this gives $G(x,y,z,1)$, from
which the form (30) of $F(z,u)$ derives, since $F(z,u) = G(u,u,z,1)$.

In such a seemingly difficult situation, one should first estimate $[z^n]F(z,1)$, the number
of parallelograms of "size" (i.e., area) equal to $n$. We have $F(z,1) = J_1(z,1)/J_0(z,1)$, where
the denominator is

\[ J_0(z,1) = 1 - \frac{z}{(1-z)^2} + \frac{z^3}{(1-z)^2(1-z^2)^2} - \frac{z^6}{(1-z)^2(1-z^2)^2(1-z^3)^2} + \cdots. \]

Clearly, $J_0(z,1)$ and $J_1(z,1)$ are analytic in $|z| < 1$, and it is not hard to see that $J_0(z,1)$
decreases from 1 to about $-0.24$ when $z$ varies between 0 and $\frac{1}{2}$, with a root at

\[ \rho \doteq 0.43306\,19231\,29252, \]

where $J_0'(\rho,1) \doteq -3.76 \ne 0$, so that the zero is simple⁶. Since $F(z,1)$ is by construction
meromorphic in the unit disc, and $J_1(\rho,1) \doteq 0.48 \ne 0$, the number of parallelograms satisfies

\[ [z^n]F(z,1) \sim -\frac{J_1(\rho,1)}{\rho\,J_0'(\rho,1)} \left(\frac{1}{\rho}\right)^{n} = \alpha_1 \cdot \alpha_2^{\,n}, \]

where

\[ \alpha_1 \doteq 0.29745\,35058\,07786, \qquad \alpha_2 \doteq 2.30913\,85933\,31230. \]


As is common in meromorphic analyses, the approximation of coefficients is quite good; for 
instance, the relative error is only about 107° for n = 35. 


We are now ready for bivariate asymptotics. Take |z| < r = +5 and |u| < +4. Because 
of the form of their general terms, which involve $z^{n^2/2}u^n$ in the numerators while the
denominators stay bounded away from 0, the functions $J_0(z,u)$ and $J_1(z,u)$ remain analytic there.
Thus, $\rho(u)$ exists and is analytic for $u$ in a sufficiently small neighbourhood of 1 (by
Weierstrass preparation or implicit functions). The nondegeneracy conditions are easily verified by
numerical computations. It follows that Theorem IX.8 applies.

Proposition IX.11. The perimeter of a random parallelogram polyomino of area $n$ admits a
limit law that is Gaussian with mean and variance that satisfy $\mu_n \sim \mu n$, $\sigma_n \sim \sigma\sqrt{n}$, with

\[ \mu \doteq 0.84176\,20156, \qquad \sigma \doteq 0.42420\,65326. \]

This indicates that a random parallelogram is most likely to resemble a slanted stack of
fairly short segments. END OF EXAMPLE IX.12.
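The univariate constants of this example can be reproduced with a few lines of floating-point arithmetic. The Python sketch below sums the series $J_0(z,1) = \sum_{n \ge 0} (-1)^n z^{n(n+1)/2} / ((z;z)_n)^2$ and locates its root in $(0, \frac{1}{2})$ by bisection; the truncation at 40 terms, the number of bisection steps, and the helper names are assumptions made for illustration.

```python
def J0_at_u1(z, terms=40):
    """Partial sum of J0(z, 1) = sum_{n>=0} (-1)^n z^{n(n+1)/2} / ((z;z)_n)^2."""
    total = 0.0
    qfact = 1.0                       # (z;z)_n = (1-z)(1-z^2)...(1-z^n)
    for n in range(terms):
        if n > 0:
            qfact *= 1.0 - z ** n
        total += (-1) ** n * z ** (n * (n + 1) // 2) / qfact ** 2
    return total

def bisect_root(f, lo, hi, iterations=80):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

rho = bisect_root(J0_at_u1, 0.0, 0.5)
print(rho, 1.0 / rho)   # close to 0.4330619231 and 2.3091385933
```

The inverse of the root is the exponential growth rate $\alpha_2$ of the number of parallelogram polyominoes of area $n$.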


▷ IX.25. Width and height of parallelogram polyominoes are normal. Similar perturbation
methods show that the expected height and width are each $O(n)$ on average, again with Gaussian
limits. ◁

▷ IX.26. The base of a coin fountain. A coin fountain (Chapter IV) is defined as a vector
$v = (v_0, v_1, \ldots, v_\ell)$, such that $v_0 = 0$, $v_j \ge 0$ is an integer, $v_\ell = 0$ and $|v_{j+1} - v_j| = 1$. Take
as size the area, $n = \sum v_j$. Then the distribution of the base length $\ell$ in a random coin fountain
of size $n$ is asymptotically normal. (This amounts to considering all ruin sequences of a fixed
area as equally likely, and considering the number of steps in the game as a random variable.)
Similarly the number of vector entries equal to 0 is asymptotically Gaussian. ◁


Perturbation of systems of linear equations. There is usually a fairly transparent
approach to the analysis of BGFs defined implicitly as solutions of functional
equations. One should start with the analysis at $u = 1$ and then examine the effect on
singularities when $u$ varies in a very small neighbourhood of 1. In accordance with
what we have already seen many times, the process is a perturbation analysis of the
solution to a functional equation near a singularity, here one that moves.

We illustrate, mostly by way of examples, the application of Theorem IX.8 to functions
defined implicitly by a linear system of positive equations. Positive rational functions
arise in connection with problems that can be equivalently described by finite state
devices, by paths in graphs, and by Markov chains. The bivariate problem is then
expressed by a linear equation

\[ Y(z,u) = V(z,u) + T(z,u) \cdot Y(z,u), \tag{32} \]


⁶As usual, such computations can be easily validated by carefully controlled numerical evaluations
coupled with Rouché's theorem (see Chapter IV).

where $T(z,u)$ is an $m \times m$ matrix with polynomial entries in $z$, $u$ having nonnegative
coefficients, $Y(z,u)$ is an $m \times 1$ column vector of unknowns, and $V(z,u)$ is a column
vector of nonnegative initial conditions.

Regarding the univariate problem,

\[ Y(z) = V(z) + T(z) \cdot Y(z), \tag{33} \]

where $Y(z) = Y(z,1)$ and so on, we place ourselves under the assumptions of
Corollary V.1 of Chapter V. This means that properness, positivity, irreducibility, and
aperiodicity are assumed throughout. In this case (see the developments of Chapter V),
Perron-Frobenius theory applies to the univariate matrix $T(z)$. In other words, the
function

\[ C(z) = \det(I - T(z)) \]

has a unique dominant root $\rho > 0$ that is a simple zero. Accordingly, any component
$F(z) = Y_j(z)$ of a solution to the system (33) has a unique dominant singularity
at $z = \rho$ that is a simple pole,

\[ F(z) = \frac{B(z)}{C(z)}, \]

with $B(\rho) \ne 0$.

In the bivariate case, each component of the solution to the system (32) can be
put under the form

\[ F(z,u) = \frac{B(z,u)}{C(z,u)}, \qquad C(z,u) = \det(I - T(z,u)). \]

Since $B(z,u)$ is a polynomial with $B(\rho,1) \ne 0$, it does not vanish for $(z,u)$ in a sufficiently small
neighbourhood of $(\rho,1)$. Similarly, by the analytic Implicit Function Theorem, there
exists a function $\rho(u)$ locally analytic near $u = 1$, such that

\[ C(\rho(u),u) = 0, \qquad \rho(1) = \rho. \]

Thus, it is sufficient that the variability condition (27) be satisfied to infer a limit
Gaussian distribution.


Theorem IX.9 (Positive rational systems). Let $F(z,u)$ be a bivariate function that
is analytic at $(0,0)$ and has nonnegative coefficients. Assume that $F(z,u)$ coincides
with the component $Y_1$ of a system of linear equations in $Y = (Y_1,\ldots,Y_m)^T$,

\[ Y = V + T \cdot Y, \]

where $V = (V_1(z,u),\ldots,V_m(z,u))$, $T = (T_{i,j}(z,u))_{i,j=1}^{m}$, and each of $V_j$, $T_{i,j}$
is a polynomial in $z$, $u$ with nonnegative coefficients. Assume also that $T(z,1)$ is
transitive, proper, and primitive, and let $\rho(u)$ be the unique solution of

\[ \det(I - T(\rho(u),u)) = 0, \]

assumed to be analytic at 1, such that $\rho(1) = \rho$. Then, provided the variability condition,

\[ v\!\left(\frac{\rho(1)}{\rho(u)}\right) > 0, \]

is satisfied, a Gaussian limit law holds for the coefficients of $F(z,u)$ with mean and
variance that are $O(n)$ and speed of convergence that is $O(n^{-1/2})$.

The constants $\mu$, $\sigma$ involved in estimates of the mean and standard deviation,
$\mu_n \sim \mu n$, $\sigma_n \sim \sigma\sqrt{n}$, are then determined from $C(z,u) = \det(I - T(z,u))$ by
Eq. (26). Thus, in any particular application, one can determine by computation
whether the variability condition is satisfied. It may however be more difficult to
check these conditions for whole classes of problems.


EXAMPLE IX.13. Limit theorem for Markov chains. Assume that $M$ is the transition matrix
of an irreducible aperiodic Markov chain, and consider the parameter $\chi$ that records the number
of passages through state 1 in a path of length $n$ that starts in state 1. Then, the theorem applies
with

\[ V = (1, 0, \ldots, 0)^T, \qquad T_{i,j}(z,u) = zM_{i,j} + z(u-1)M_{i,1}\delta_{j,1}. \]

We therefore derive a classical limit theorem for Markov chains:

Proposition IX.12. In an irreducible and aperiodic (finite) Markov chain, the number of times
that a designated state is reached when $n$ transitions are effected is asymptotically Gaussian.

The conclusion also applies to paths in any strongly connected aperiodic digraph as well
as to paths conditioned by their source and/or destination. END OF EXAMPLE IX.13.


▷ IX.27. Sets of patterns in words. This note extends Example 11 relative to the occurrence of
a single pattern in a random text. Given the class $\mathcal{W} = \mathfrak{S}(\mathcal{A})$ of words over a finite alphabet $\mathcal{A}$,
fix a finite set of "patterns" $S \subset \mathcal{W}$ and define the parameter $\chi(w)$ as the total number of
occurrences of members of $S$ in the word $w \in \mathcal{W}$. It is possible to build a finite automaton
(essentially a digital tree built on $S$ equipped with return edges) that records simultaneously the
number of partial occurrences of each pattern. Then, the limit law of $\chi$ is Gaussian; see Bender
and Kochman's paper [35] and [189, 192] for an approach based on the de Bruijn graph. ◁


Virtually all of the combinatorial classes that resort to transfer matrix methods 
exposed in Chapter V lead to Gaussian laws in the asymptotic limit. 


EXAMPLE IX.14. Tilings. (See Bender [37].) Take a $(2 \times n)$ chessboard of 2 rows and $n$
columns, and consider coverings with "monomer tiles" that are $(1 \times 1)$-pieces, and "dimer tiles"
that are either of the horizontal $(1 \times 2)$ or vertical $(2 \times 1)$ type. The parameter of interest is here
the number of tiles. Consider next the collection of all "partial coverings" in which each column
is covered exactly, except possibly for the last one. The partial coverings are of one of 4 types
and the legal transitions are described by a compatibility graph. For instance, if the previous
column started with one horizontal dimer and contained one monomer, the current column has
one occupied cell, and one free cell that may then be occupied either by a monomer or a dimer.
This finite state description corresponds to a set of linear equations over BGFs (with $z$ marking
the area covered and $u$ marking the total number of tiles), with the transition matrix found to be


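[Display: the $4 \times 4$ transition matrix $T(z,u)$, of the form $z$ times a matrix of monomials in $u$; its entries are not legible in this copy.]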

In particular, we have 


det(I — T(z,u)) =1—zu—27(u? +.u°). 


Then, Theorem IX.9 applies: the number of tiles is asymptotically normal. The method clearly
extends to $(k \times n)$ chessboards, for any fixed $k$. END OF EXAMPLE IX.14.


▷ IX.28. Succession-constrained integer compositions. Consider integer compositions where
consecutive summands add up to at least 4. The number of summands in such a composition of
large size is asymptotically normal. [Hint: see Bender and Richmond [37].] ◁

▷ IX.29. Height in trees of bounded width. Consider general Catalan trees of width less than a
fixed bound $w$. (The width is the maximum number of nodes at any level in the tree.) In such
trees, the distribution of height is asymptotically Gaussian. ◁


IX.7. Perturbation of singularity analysis asymptotics 


In this section, we examine schemes that arise when generating functions
contain algebraic-logarithmic singularities. For instance, trees often lead to singularities
that are of the square-root type and such a singular behaviour persists for a
number of bivariate generating functions associated to additively inherited parameters. In
such cases, the underlying machinery is the method of singularity analysis detailed in
Chapter VI, on which suitable perturbative developments are applied.

An especially important feature of the method of singularity analysis and of the
associated Hankel contours is the fact that it preserves uniformity of expansions⁷.
This feature is crucial in translating bivariate expansions, where we need to estimate
uniformly a coefficient $f_n(u) = [z^n]F(z,u)$ that depends on the parameter $u$, given
some (uniform) knowledge on the singular structure of $F(z,u)$ in terms of $z$. We state
here an easy but crucial lemma that takes care of remainder terms in expansions and
hence enables the use of singularity analysis in a perturbed context.


Lemma IX.2 (Uniformity lemma, singularity analysis). Let $f_u(z)$ be a family of functions
analytic in a common $\Delta$-domain $\Delta$, with $u$ a parameter taken in a bounded set $U$.
Suppose that there holds

\[ |f_u(z)| \le K(u)\,|1 - z|^{-\alpha(u)}, \]

where $K(u)$ is uniformly bounded, $K(u) \le K$ for $u \in U$, and $\alpha(u)$ is such that
$-\Re(\alpha(u)) > B$ for some finite real $B$. Then, there exists a constant $\widehat{K}$ (computable
from $\Delta$, $K$, $B$) such that

\[ \bigl|[z^n] f_u(z)\bigr| < \widehat{K}\,n^{-B-1}. \]


PROOF. It suffices to revisit the proof of the Big-Oh transfer ($O$-transfer) theorem
of Chapter VI, paying due attention to uniformity. The proof proceeds by Cauchy's
formula,

\[ f_{u,n} \equiv [z^n] f_u(z) = \frac{1}{2i\pi} \int_{\gamma} f_u(z)\,\frac{dz}{z^{n+1}}, \]

where $\gamma = \bigcup_j \gamma_j$ is the contour used earlier. Accordingly, we let $f_{u,n}^{(j)}$ be the
contribution in Cauchy's integral arising from part $\gamma_j$ of the contour. Let $r$ be the radius of the
circular part of the contour, corresponding in earlier notations to $\gamma_3$. Without loss of
generality, we may assume $|r - 1| < 1$. Trivial bounds imply when $B > 0$ that


K —n 


(3 — 
If. IS GopeAr ? 


Un 


⁷For instance, Darboux's method only provides non-constructive error terms, as it is based on the
Riemann-Lebesgue lemma; it cannot be employed for bivariate asymptotics. A similar comment applies to
most Tauberian theorems.

FIGURE IX.10. A display of the family of GF's $F(z,u_0)$ corresponding to
leaves in general Catalan trees when $u_0 \in [\frac{1}{2}, \frac{3}{2}]$. It is seen that the singularities
are all of the square root type (dashed line), with a movable singularity at $\rho(u) = (1 + \sqrt{u})^{-2}$.


with an analogous formula if $B < 0$. The part $\gamma_1$ corresponding to the small circular
arc at distance $1/n$ from 1 is similarly dealt with by trivial bounds to the effect that


pe ska 


The two conjugate rectilinear parts corresponding to $\gamma_2$, $\gamma_4$ each lead to


K Co 1 n 
Jun) = AS Stun, In [m8 (14 Atc080) 
7 1 n 


Combining the four majorizations yields the result. What this lemma expresses 
is more general than the meromorphic scheme; only the error terms in estimates of 
PGFs tend to be naturally less good as we replace an exponentially small error term 
inherent to meromorphic functions by a term that is usually O(n~*) in the context of 
singularity analysis. (Note that the proof above also supplies the uniformity estimates 
needed in the proof of the little-oh transfer (o-transfer) of Chapter VI.) 


▷ IX.30. Uniformity in the presence of logarithmic multipliers. Similar estimates hold when
$f(z)$ is multiplied by a power of $L(z) = -\log(1 - z)$. ◁


EXAMPLEIX.15. Leaves in general Catalan trees. As an introductory example, let us briefly 
revisit the analysis of the number of leaves in general Catalan trees, a problem already treated 
in Chapter III. where an explicit expression (a product of two binomial coefficients) has been 
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derived. The computations are a little simpler if we adopt as BGF 


G(z,u) = F(z,w?) = 5 (1 + (u? —1)z—- f1— 202 + z+ (we? — 172?) ; 


so that we consider a parameter equal to twice the number of leaves. In this case, the discrimi- 
nant factors nicely: 


L-3ar S e + e 1) 2 = eay \a a — a"); 


which leads to the expression 
(34) G(z,u) = A(z,u) + B(z, u)VC(z, w), 


with 


ACTS s(t LP aa. BOA = -5vi Sed Se. 


1 
Clz,u) = 5 (1—2(1+4)). 
This decomposition clearly shows that, when wu is close enough to 1, the function G(z, u) has a 
dominant singularity of the square-root type at 
1 
= Tae 

At the same time, if u is kept such that |1 — u| < 4, then B(z, u) remains analytic in both of 
its arguments for |z| < 2. For any such fixed u, we have for the BGF, by (34), 


(35)    \[ G(z,u) = a_0(u) + b(u)\sqrt{1 - z/\rho(u)} + a_1(u)\left(1 - z/\rho(u)\right) + O\left((1 - z/\rho(u))^{3/2}\right), \]
for some computable coefficients $a_0$, $a_1$, b, c that depend on u and are in fact analytic in u near
u = 1. Singularity analysis then provides, pointwise for each u,


(36)    \[ [z^n]G(z,u) = -\frac{1}{2\sqrt{\pi}}\, B(\rho(u),u)\, \rho(u)^{-n} n^{-3/2} \left(1 + O\!\left(\frac{1}{n}\right)\right). \]


The expansion (35) is uniform when u lies in a sufficiently small complex neighbourhood of 1.
It can be seen (details below) that the expansion of the coefficient in (36) is also uniform by
virtue of the general uniformity-preserving property of the singularity analysis process,
as expressed by Lemma IX.2. We are thus exactly in a case of application of the Quasi-Powers
Theorem, so that the distribution of the number of leaves is asymptotically Gaussian.
END OF EXAMPLE IX.15.
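
The factorization of the discriminant and the location of the movable singularity in Example IX.15 can be spot-checked mechanically. The following is a minimal sketch in Python with exact rational arithmetic; the function names and the sample points are illustrative choices, not part of the text, and a handful of evaluations only tests the identity rather than proving it.

```python
from fractions import Fraction

def discriminant(z, u):
    """Quantity under the square root of G(z,u): 1 - 2(u^2+1) z + (u^2-1)^2 z^2."""
    return 1 - 2 * (u * u + 1) * z + (u * u - 1) ** 2 * z * z

def factored(z, u):
    """Claimed factorization (1 - z(1+u)^2)(1 - z(1-u)^2)."""
    return (1 - z * (1 + u) ** 2) * (1 - z * (1 - u) ** 2)

def rho(u):
    """Movable singularity rho(u) = 1/(1+u)^2, the zero in z of C(z,u) = 1 - z(1+u)^2."""
    return 1 / (1 + u) ** 2

# Arbitrary rational sample points (z, u); any values with u != -1 would do.
samples = [
    (Fraction(1, 7), Fraction(3, 5)),
    (Fraction(-2, 3), Fraction(9, 4)),
    (Fraction(5), Fraction(-1, 2)),
]
identity_holds = all(discriminant(z, u) == factored(z, u) for z, u in samples)
vanishes_at_rho = all(discriminant(rho(u), u) == 0 for _, u in samples)
print(identity_holds, vanishes_at_rho)
```

At u = 1 the function rho returns 1/4, the radius of convergence of the univariate Catalan generating function.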


IX.7.1. General algebraic-logarithmic conditions. The example of leaves in
trees leads to simple computations, but it is characteristic of the machinery needed in
more general cases. The theorem that follows is relative to any singular exponent α
not in $\mathbb{Z}_{\le 0}$.

Theorem IX.10 (Algebraic singularity schema). Let F(z, u) be a bivariate function 
that is bivariate analytic at (z,u) = (0,0) and has nonnegative coefficients there. 
Assume the following conditions: 
(i) Algebraic perturbation: there exist three functions A, B, C, analytic in a
domain D = {|z| < r} × {|u − 1| < ε}, for some r > 0 and ε > 0, such
that the following representation holds,


(37)    \[ F(z,u) = A(z,u) + B(z,u)\,C(z,u)^{-\alpha}, \]
and such that ρ < r is the unique (simple) root in |z| < r of the equation C(z,1) = 0,
with B(ρ,1) ≠ 0.

(ii) Nondegeneracy: one has $\partial_z C(\rho,1) \cdot \partial_u C(\rho,1) \ne 0$, ensuring the existence
of a nonconstant ρ(u) analytic at u = 1, such that C(ρ(u),u) = 0 and
ρ(1) = ρ.

(iii) Variability: one has $\mathfrak{v}\left(\rho(1)/\rho(u)\right) \ne 0$.


Then, the random variable with probability generating function


\[ p_n(u) = \frac{[z^n]F(z,u)}{[z^n]F(z,1)} \]
converges in distribution to a Gaussian variable with a speed of convergence that is
$O(n^{-1/2})$. The mean $\mu_n$ and the variance $\sigma_n^2$ are asymptotically linear in n.


The remarks following the statement of Theorem IX.8 apply. Accordingly, the
mean $\mu_n$ and variance $\sigma_n^2$ are computable by the general formula (25), and the
variability condition is expressible in terms of the values of C and its derivatives at (ρ, 1)
by means of Equation (27).


PROOF. Observe first that one does not need to worry about the a priori domain of
existence of F(z, u) since Equation (37) provides automatically analytic continuation
to a collection of Δ-domains at ρ(u) when u varies. Thus, it suffices that the
representation (37) be established initially in some open domain of {|z| < ρ} × {|u| < 1},
by unicity of analytic continuation.

By the assumptions made, the function F(z, 1) admits a singular expansion of the
form

(38)    \[ F(z,1) = \left(a_0 + a_1(z-\rho) + \cdots\right) + \left(b_0 + b_1(z-\rho) + \cdots\right)\left(c_1(z-\rho) + c_2(z-\rho)^2 + \cdots\right)^{-\alpha}. \]

There, the $a_j$, $b_j$, $c_j$ represent the coefficients of the expansion in z of A, B, C for
z near ρ when u is instantiated at 1. (We may consider C(z, u) normalized by the
condition that $-c_1$ is positive real, and take, e.g., $c_1 = -1$.) Singularity analysis then
implies the estimate


(39)    \[ [z^n]F(z,1) = b_0\,(-c_1\rho)^{-\alpha}\, \frac{\rho^{-n} n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + O\!\left(\frac{1}{n}\right)\right). \]

All that is needed now is a “lifting” of relations (38) and (39), for u in a small
neighbourhood of 1. First, we observe that by the analyticity assumption on A, the
coefficient $[z^n]A(z,u)$ is exponentially small compared to $\rho^{-n}$, for u close enough
to 1. Thus, for our purposes, we may freely restrict attention to $B(z,u)C(z,u)^{-\alpha}$.
(The function A is only needed in some cases so as to ensure nonnegativity of the
first few coefficients of F.) Next, it is convenient to operate with a fixed rather than
movable singularity. This is simply achieved by considering the normalized function


\[ \Phi(z,u) := B(z\rho(u),u)\, C(z\rho(u),u)^{-\alpha}. \]
Provided u is restricted to a suitably small neighbourhood of 1 and z to |z| < R for
some R > 1, the functions B(zρ(u), u) and C(zρ(u), u) are analytic in both z and u,
with C(zρ(u), u) having a fixed simple zero at z = 1. There results that the function


\[ \frac{C(z\rho(u),u)}{1-z} \]


has a removable singularity at z = 1 and is in fact analytic in |z| < r, |u − 1| < δ.
Thus, Φ satisfies an expansion of the form


\[ \Phi(z,u) = (1-z)^{-\alpha} \sum_{n\ge 0} \phi_n(u)(1-z)^n \]
that is convergent and such that each coefficient $\phi_j(u)$ is an analytic function of u for
|u − 1| < δ.
We may restrict this neighbourhood as we please, with |u − 1| < δ provided we
keep ε > δ > 0. First, by Weierstrass preparation, there is for u sufficiently near to 1,
a unique simple root ρ(u) near ρ of the equation


\[ C(\rho(u),u) = 0. \]
We have ρ(1) = ρ with ρ(u) being locally analytic at 1. One can then expand A, B, C
near (ρ(u), u). This gives the bivariate expansion


(40)    \[ F(z,u) = \left(a_0(u) + a_1(u)(z-\rho(u)) + \cdots\right) + \left(b_0(u) + b_1(u)(z-\rho(u)) + \cdots\right)\left(c_1(u)(z-\rho(u)) + c_2(u)(z-\rho(u))^2 + \cdots\right)^{-\alpha}. \]


There, by assumption, we have that $a_j(u)$, $b_j(u)$, $c_j(u)$ are analytic in |u − 1| < ε,
and are each $O(r^{-j})$. In addition, $\rho(u)^{\alpha}$ and $(-c_1(u))^{\alpha}$ are well-defined by principal
values, since their specializations at u = 1 are positive. Thus, we have a singular
expansion for F(z, u); for instance, when α ∈ ]−1, 0[,


(41)    \[ F(z,u) = a_0(u) + b_0(u)\left(-c_1(u)\rho(u)\right)^{-\alpha}\left(1 - z/\rho(u)\right)^{-\alpha} + R(z), \]
where 
R(z) =O ((1— z/p(u))***) 


and the O-error term is uniform for |u − 1| < δ:
\[ |R(z)| \le K \cdot |1 - z/\rho(u)|, \]


for some absolute constant K. We thus have


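(42)    \[ [z^n]F(z,u) = b_0(u)\left(-c_1(u)\rho(u)\right)^{-\alpha} \frac{\rho(u)^{-n} n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + O\!\left(\frac{1}{n}\right)\right), \]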


where the error term is again uniform. An especially important fact for this argument 
is the following: the singularity analysis process is a uniform coefficient extraction 
method. This is precisely provided by Lemma IX.2. 

Equation (42) shows that $f_n(u) = [z^n]F(z,u)$ satisfies precisely the conditions
of the Quasi-Powers Theorem. Therefore, the law with PGF $f_n(u)/f_n(1)$ is asymptotically
normal with a mean and a variance that are both O(n). Since the error
term in (42) is O(1/n), the speed of convergence to the Gaussian limit is $O(1/\sqrt{n})$.


▷ IX.31. Logarithmic multipliers. The conclusions of Theorem IX.10 extend to functions
representable under the more general form


\[ F(z,u) = A(z,u) + B(z,u)\,C(z,u)^{-\alpha} \left(\log C(z,u)\right)^{k}. \]
(The proof follows exactly the same pattern.) ◁


EXAMPLE IX.16. Leaves in classical varieties of trees. We start with binary Catalan trees and
with the BGF


\[ F(z,u) = z\left(u + 2F(z,u) + F(z,u)^2\right), \]
so that


\[ F(z,u) = \frac{1}{2z}\left(1 - 2z - \sqrt{\left(1 - 2z(1+\sqrt{u})\right)\left(1 - 2z(1-\sqrt{u})\right)}\right). \]


This is almost the same as the BGF of leaves in general Catalan trees. The dominant singularity
is at $\rho(u) = \frac{1}{2(1+\sqrt{u})}$, and the limit law is Gaussian. The asymptotic forms of the mean and
variance are immediately derived from ρ(u), and we find that the number of leaves $X_n$ in a binary
Catalan tree satisfies


\[ E\{X_n\} = \frac{n}{4} + O(1), \qquad \sigma\{X_n\} = \frac{\sqrt{n}}{4} + O(n^{-1/2}). \]
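
The estimate for the mean can be compared with exact counts. The sketch below reads the functional equation F = z(u + 2F + F²) as a recurrence on the coefficient of z^n, so that each coefficient is a polynomial in u stored as a list; the helper names are illustrative, and the total counts are checked against the Catalan numbers in the accompanying assertions.

```python
from fractions import Fraction

def leaf_polynomials(n_max):
    """f[n][k] = number of binary Catalan trees with n nodes and k leaves,
    obtained from F = z (u + 2 F + F^2) by extracting the coefficient of z^n."""
    f = [[0] for _ in range(n_max + 1)]
    if n_max >= 1:
        f[1] = [0, 1]  # a single node, which is a leaf
    for n in range(2, n_max + 1):
        poly = [0] * (n + 1)
        # root with one child (left or right): contributes 2 F
        for k, c in enumerate(f[n - 1]):
            poly[k] += 2 * c
        # root with two children: contributes F^2, subtree sizes i + j = n - 1
        for i in range(1, n - 1):
            for a, ca in enumerate(f[i]):
                if ca:
                    for b, cb in enumerate(f[n - 1 - i]):
                        poly[a + b] += ca * cb
        f[n] = poly
    return f

def mean_leaves(poly):
    """Exact mean number of leaves for one coefficient polynomial."""
    return Fraction(sum(k * c for k, c in enumerate(poly)), sum(poly))

f = leaf_polynomials(40)
print(float(mean_leaves(f[40])), 40 / 4)
```

The two printed values differ by a bounded amount, in line with the O(1) term of the mean.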
In the case of Cayley trees, the BGF equation⁸ is
\[ F(z,u) = z\left(u - 1 + e^{F(z,u)}\right). \]


By Lagrange inversion, the distribution is related to the Stirling partition numbers. The
functional equation admits an explicit solution in terms of Lambert’s “W-function”, which is such
that $z = We^{W}$, with the branch choice that W = 0 when z = 0. Thus, W(z) = −T(−z),
where $T = ze^{T}$ is the classical “Cayley tree function”. Here, we have


\[ F(z,u) = z(u-1) - W\!\left(-z e^{z(u-1)}\right). \]


The function W has a dominant singularity of the square-root type at $-e^{-1}$. Thus, one can
solve for ρ(u), again in terms of the W function. Here, we find


\[ \rho(u) = \frac{1}{u-1}\, W\!\left(e^{-1}(u-1)\right). \]

In particular, we get $\rho(1) = e^{-1}$, as we should. The expansion near u = 1 then comes
automatically


\[ \rho(u) = e^{-1} - e^{-2}(u-1) + \frac{3}{2}e^{-3}(u-1)^2 + O\left((u-1)^3\right). \]


Hence the mean and the variance of the number $X_n$ of leaves in a random tree of size n satisfy:


\[ E\{X_n\} \sim e^{-1} n \approx 0.36787\,n, \qquad \sigma^2\{X_n\} \sim e^{-2}(e-2)\, n \approx 0.09720\,n, \]


and the limit law is a Gaussian. END OF EXAMPLE IX.16.
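
The link with the Stirling partition numbers can be made concrete. Applying Lagrange inversion to F = z(u − 1 + e^F) gives, under the derivation assumed in this sketch, that the number of Cayley trees of size n with k leaves is n!/k! · S(n−1, n−k), where S denotes the Stirling partition numbers; this closed form and the function names are not spelled out in the text, so the assertions check the counts against the total n^(n−1).

```python
from fractions import Fraction
from math import factorial

def stirling_partition_row(n):
    """Return [S(n, 0), ..., S(n, n)], row n of the Stirling partition numbers."""
    row = [1]  # S(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for j in range(1, m + 1):
            # S(m, j) = S(m-1, j-1) + j * S(m-1, j)
            new[j] = row[j - 1] + (j * row[j] if j < m else 0)
        row = new
    return row

def leaf_counts(n):
    """counts[k] = number of Cayley trees of size n with k leaves (assumed closed form)."""
    s = stirling_partition_row(n - 1)
    return [0] + [factorial(n) // factorial(k) * s[n - k] for k in range(1, n + 1)]

def mean_and_variance(n):
    """Exact mean and variance of the number of leaves in a random Cayley tree of size n."""
    counts = leaf_counts(n)
    total = sum(counts)
    mean = Fraction(sum(k * c for k, c in enumerate(counts)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(counts)), total)
    return mean, second - mean ** 2

mean, variance = mean_and_variance(200)
print(float(mean) / 200, float(variance) / 200)
```

The two printed ratios can be compared with the constants e^{-1} and e^{-2}(e − 2) of the example.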


▷ IX.32. Leaves in Motzkin trees. The number of leaves in a unary-binary (Motzkin) tree is
asymptotically Gaussian. ◁


⁸This example constitutes a typical application of symbolic manipulation systems.
EXAMPLE IX.17. Patterns in binary Catalan trees. We develop here a more sophisticated
example coming from the analysis of pattern matching in trees [456, 209] that generalizes the
problem of leaves. Fix a nonempty binary tree w and let $\omega[t] = \omega_w[t]$ be the number of
occurrences of pattern w in tree t. By this, we mean the number of internal nodes ν in t such
that the subtree of t rooted at ν is isomorphic to w. The problem is of interest in the analysis
of some symbolic manipulation algorithms and of “sharing” strategies; see [456, 209] for the
algorithmic context.

A pattern occurs either in the left root subtree $t_0$ or in the right root subtree $t_1$ or at the
root itself if t coincides with w. This gives rise to the recursive definition


\[ \omega[t] = \omega[t_0] + \omega[t_1] + [[t = w]], \qquad \omega[\emptyset] = 0, \]


where [[P]] denotes the indicator function of P whose value is 1 if P is true, and 0 otherwise.
The function $u^{\omega[t]}$ is almost multiplicative, and


\[ u^{\omega[t]} = u^{[[t=w]]}\, u^{\omega[t_0]}\, u^{\omega[t_1]} = u^{\omega[t_0]} u^{\omega[t_1]} + [[t = w]]\cdot(u-1). \]


Thus, the bivariate generating function F(z, u) where z marks internal nodes and u marks the
number of occurrences of w,
\[ F(z,u) := \sum_{t} z^{|t|} u^{\omega[t]}, \]
satisfies the algebraic equation,
\[ F(z,u) = 1 + (u-1)z^{m} + zF(z,u)^2, \]


with m = |w| the number of internal nodes of w.
The quadratic equation for F leads to


\[ F(z,u) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z - 4z^{m+1}(u-1)}\right). \]


The discriminant has a unique root ρ = 1/4 when u = 1, while it has m + 1 roots for u ≠ 1.
By general properties of implicit and algebraic functions (implicit function theorem, Weierstrass
preparation), as u tends to 1, one of these roots, call it ρ(u), tends to 1/4 while all the other ones
$\{\rho_j(u)\}_{j=1}^{m}$ escape to infinity. We have


\[ H(z,u) := \frac{1 - 4z - 4z^{m+1}(u-1)}{1 - z/\rho(u)} = \prod_{j=1}^{m} \left(1 - z/\rho_j(u)\right), \]
which is an analytic function in (z, u) for (z, u) in a complex neighbourhood of (1/4, 1). This
results from the fact that the algebraic function 1/ρ(u) is analytic at u = 1. It gives the singular
expansion of G(z,u) = zF(z,u):


\[ G(z,u) = \frac{1}{2} - \frac{1}{2}\sqrt{H(z,u)}\,\sqrt{1 - z/\rho(u)}. \]


Thus, we are exactly under the conditions of the theorem. The quantity ω taken over a random
binary tree of size n + 1 has mean and variance given asymptotically by


oan)" *(cin)* 


The expansion of ρ(u) at 1 is computed easily by iteration of the defining equation:


\[ z = \frac{1}{4} - z^{m+1}(u-1). \]
Thus,
\[ \rho(u) = \frac{1}{4} - \frac{1}{4^{m+1}}(u-1) + \frac{m+1}{4^{2m+1}}(u-1)^2 + O\left((u-1)^3\right). \]
This shows that the mean $\mu_n$ and the variance $\sigma_n^2$ of the number of occurrences of a pattern
of size m in a random binary tree of size n satisfy


\[ \mu_n = \frac{n}{4^{m}} + O(1), \qquad \sigma_n^2 = n\left(\frac{1}{4^{m}} - \frac{2m+1}{4^{2m}}\right) + O(1); \]


also, the distribution is asymptotically Gaussian. In particular, the probability of occurrence of
a pattern at a random node of a random tree decreases fast (the factor of $4^{-m}$ in the estimate of
averages) with the size of the pattern, a property that was to be expected and that also holds for
strings. The paper of Steyaert and Flajolet [456] shows that similar properties (equivalent to the
mean value analysis) hold for any simply generated family. The expression of the BGF F(z, u)
is given by Flajolet, Sipala, and Steyaert in [209], where similar developments are used to show
that the minimal “dag representation” of a random tree (identical subtrees are “shared” and
represented only once) is of average size $O(n(\log n)^{-1/2})$. END OF EXAMPLE IX.17.
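
The mean value n/4^m can be checked against exact counts. The sketch below reads F = 1 + (u − 1)z^m + zF² as a recurrence on polynomials in u and compares the resulting mean with the closed form binom(2(n−m), n−m)/C_n, where C_n is the n-th Catalan number; that closed form is derived here by differentiating the functional equation at u = 1 and is an assumption of this sketch rather than a formula quoted from the text.

```python
from fractions import Fraction
from math import comb

def poly_mul(a, b):
    """Product of two polynomials in u given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b):
                out[i + j] += x * y
    return out

def poly_add(a, b):
    """Sum of two polynomials in u given as coefficient lists."""
    out = [0] * max(len(a), len(b))
    for i, x in enumerate(a):
        out[i] += x
    for i, y in enumerate(b):
        out[i] += y
    return out

def pattern_polynomials(m, n_max):
    """f[n][k] = number of binary trees with n internal nodes containing
    k occurrences of a fixed pattern with m internal nodes."""
    f = [[1]]  # the empty tree
    for n in range(1, n_max + 1):
        poly = [0]
        for i in range(n):  # left subtree of size i, right subtree of size n - 1 - i
            poly = poly_add(poly, poly_mul(f[i], f[n - 1 - i]))
        if n == m:
            poly = poly_add(poly, [-1, 1])  # the (u - 1) z^m correction term
        f.append(poly)
    return f

def mean_from_polynomial(poly):
    return Fraction(sum(k * c for k, c in enumerate(poly)), sum(poly))

def exact_mean(m, n):
    """Assumed closed form for the mean: binom(2(n-m), n-m) / Catalan(n), for n >= m."""
    return Fraction(comb(2 * (n - m), n - m), comb(2 * n, n) // (n + 1))

print(float(exact_mean(2, 2000)) / 2000, 4.0 ** -2)
```

The printed ratio is close to 4^{-m} for m = 2, as predicted by the asymptotic form of the mean.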


▷ IX.33. Patterns in classical varieties of trees. Patterns in general Catalan trees and Cayley
trees can be similarly analysed. ◁


We shall see later that such laws, established here via explicit representations of
the BGFs, extend to varieties of trees whose generating functions are only accessible
implicitly via functional equations (Subsection IX.7.3).


IX.7.2. The exponential-logarithmic schema. So far, the occurrence of a
Gaussian law has been related to a movable singularity that causes coefficients of
a bivariate generating function F(z, u) to obey a rough power law of the form


\[ f_n(u) = [z^n]F(z,u) \approx \rho(u)^{-n}, \]


so that the Quasi-Powers Theorem applies with a scaling factor $\beta_n = n$. In this
section, we discuss the situation of a fixed singularity and variable exponent in singular
expansions. This means a somewhat stronger decomposition property for a
BGF as the singularity remains constant when the auxiliary parameter u varies, as
in $F(z,u) = C(z)^{-\alpha(u)}$. Typical cases of application are to the set constructions,
where the analysis of number of components can be rephrased as the estimation of
coefficients in


\[ F(z,u) = \exp\left(uG(z)\right), \]
when G(z) is, roughly speaking, logarithmic. In this case, we have parameters whose
mean and variance grow logarithmically, a typical instance being the number of cycles
in permutations. Analytically, this comes from an approximate form


\[ F(z,u) \approx (1 - z/\rho)^{-\alpha(u)}, \]


so that


\[ f_n(u) = [z^n]F(z,u) \approx \rho^{-n} n^{\alpha(u)-1} = \frac{\rho^{-n}}{n} \exp\left(\alpha(u)\log n\right). \]
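
The power law behind this approximation can be observed numerically in the simplest case ρ = 1, where the coefficient of z^n in (1 − z)^{−α} equals Γ(n + α)/(Γ(α) n!). The sketch below compares this exact value with the first-order estimate n^{α−1}/Γ(α); it assumes a real exponent α > 0, and the function names are illustrative.

```python
from math import exp, gamma, lgamma, log

def exact_coeff(n, alpha):
    """[z^n] (1 - z)^(-alpha) = Gamma(n + alpha) / (Gamma(alpha) * n!), for alpha > 0."""
    return exp(lgamma(n + alpha) - lgamma(alpha) - lgamma(n + 1))

def approx_coeff(n, alpha):
    """First-order singularity analysis estimate n^(alpha - 1) / Gamma(alpha)."""
    return exp((alpha - 1) * log(n)) / gamma(alpha)

n = 10_000
for alpha in (0.5, 1.0, 2.5):
    print(alpha, exact_coeff(n, alpha) / approx_coeff(n, alpha))
```

Each printed ratio is close to 1, with a relative deviation of order 1/n.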


This is again a case of application of the Quasi-Powers Theorem, but now with a
scaling factor $\beta_n = \log n$. The developments in this section are inspired by a paper
of Flajolet and Soria [210] who first extracted certain universally valid laws for such
assemblies of logarithmic structures.
Theorem IX.11 (General variable exponent schema). Let F(z, u) be a bivariate function
that is analytic at (z,u) = (0,0) and has nonnegative coefficients there. Assume
the following conditions.


(i) Exponent perturbation. Assume that there exist ε > 0 and r > ρ such that
in the domain,
D = {(z,u) | |z| < r, |u − 1| < ε},
the function F(z, u) admits the representation
(43)    \[ F(z,u) = A(z,u) + B(z,u)\,C(z)^{-\alpha(u)}, \]
where A(z,u), B(z,u) are analytic for (z,u) ∈ D, the function α(u) is
analytic in |u − 1| < ε with α(1) ∉ {0, −1, −2, ...}, and C(z) is analytic
for |z| < r, the equation C(ζ) = 0 having a unique root ζ = ρ in |z| < r
that is simple, with B(ρ,1) ≠ 0.
(ii) Variability: one has
\[ \alpha'(1) + \alpha''(1) \ne 0. \]
Then the variable with probability generating function
\[ p_n(u) = \frac{[z^n]F(z,u)}{[z^n]F(z,1)} \]
converges in distribution to a Gaussian variable and the speed of convergence is
$O((\log n)^{-1/2})$. The corresponding mean $\mu_n$ and variance $\sigma_n^2$ satisfy


\[ \mu_n \sim \alpha'(1)\log n, \qquad \sigma_n^2 \sim \left(\alpha'(1) + \alpha''(1)\right)\log n. \]


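PROOF. Clearly, for the univariate problem, by singularity analysis, one has


\[ [z^n]F(z,1) = B(\rho,1)\left(-\rho C'(\rho)\right)^{-\alpha(1)} \frac{\rho^{-n} n^{\alpha(1)-1}}{\Gamma(\alpha(1))} \left(1 + O\!\left(\frac{1}{n}\right)\right). \]


For the bivariate problem, the contribution arising from $[z^n]A(z,u)$ is exponentially
small, since A(z, u) is z-analytic in |z| < r.
Write next
\[ B(z,u)C(z)^{-\alpha(u)} = \left(B(z,u) - B(\rho,u)\right)C(z)^{-\alpha(u)} + B(\rho,u)\,C(z)^{-\alpha(u)}. \]
The first term satisfies
\[ B(z,u) - B(\rho,u) = O(z - \rho), \]
uniformly with respect to u, since
\[ \frac{B(z,u) - B(\rho,u)}{z - \rho} \]
is analytic in z and u, by division of power series representations. Let A be an upper
bound on $\Re(\alpha(u))$ on |u − 1| < ε. Then, by singularity analysis and its companion
uniformity,
\[ [z^n]\left(B(z,u) - B(\rho,u)\right)C(z)^{-\alpha(u)} = O\left(\rho^{-n} n^{A-2}\right). \]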


By suitably restricting the domain of u to |u − 1| < δ, one may freely assume that
$A - 2 < \alpha(1) - \frac{3}{2}$. Thus, the contribution from this part is small.
It only remains to analyse
\[ [z^n]B(\rho,u)\,C(z)^{-\alpha(u)}. \]


This is done exactly like in the univariate case, again taking advantage of the uniformity
afforded by singularity analysis. We find, uniformly for u in a small neighbourhood
of 1,


\[ [z^n]F(z,u) = \frac{B(\rho,u)\,\rho^{-n}}{n\,\Gamma(\alpha(u))} \left(-\rho C'(\rho)\right)^{-\alpha(u)} e^{\alpha(u)\log n} \left(1 + O(n^{-1/2})\right). \]


Thus, the Quasi-Powers Theorem applies and the law is Gaussian in the limit.


The next proposition covers a scheme closely related to the exponential-logarithmic
setting. Its proof only requires a slight modification of the calculations involved
in the error terms. It complements Example 5 where the number of small components
has been found to be Poisson.


Proposition IX.13 (Sets of labelled logarithmic structures). Consider the labelled
set construction F = SET(G). Assume that G(z) has radius of convergence ρ and is
Δ-continuable with a singular expansion of the form


\[ G(z) = \kappa \log\frac{1}{1 - z/\rho} + \lambda + O\!\left(\frac{1}{\log^2(1 - z/\rho)}\right). \]


Then, the limit law of the number of G-components in a large F-structure is asymptotically
Gaussian with mean and variance both asymptotic to κ log n.


The bivariate EGF for permutations with u marking the number of cycles is


\[ F(z,u) = \sum_{n,k} \left[{n \atop k}\right] u^k \frac{z^n}{n!} = (1-z)^{-u} = \exp\left(u \log\frac{1}{1-z}\right), \]


so that we are in the simplest case of an exponential-logarithmic schema. Theo- 
rem IX.11 implies that the number of cycles in a random permutation of size n con- 
verges to a Gaussian limiting distribution. This classical result stating the asymptot- 
ically normal distribution of the Stirling numbers (of the first kind) constitutes Gon- 
charov’s Theorem. It has already been stated with a direct proof in Proposition IX.6, 
thanks to the explicit character of the “horizontal” generating functions (the Stirling 
polynomials) in this particular case. 
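
The logarithmic growth of the mean and variance of the number of cycles can be observed directly from the Stirling cycle numbers. The sketch below builds them as the coefficients of u(u+1)···(u+n−1) and computes exact moments; the function names are illustrative, and the comparison with harmonic numbers in the assertions relies on the classical identities for the mean H_n and the variance H_n − H_n^{(2)}.

```python
from fractions import Fraction
from math import log

def stirling_cycle_row(n):
    """Coefficients of u (u+1) ... (u+n-1): entry k counts permutations of size n with k cycles."""
    row = [1]
    for m in range(n):
        new = [0] * (len(row) + 1)
        for k, c in enumerate(row):
            new[k + 1] += c   # contribution of the factor's u
            new[k] += m * c   # contribution of the factor's constant m
        row = new
    return row

def cycle_mean_variance(n):
    """Exact mean and variance of the number of cycles in a random permutation of size n."""
    row = stirling_cycle_row(n)
    total = sum(row)  # equals n!
    mean = Fraction(sum(k * c for k, c in enumerate(row)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(row)), total)
    return mean, second - mean ** 2

mean, variance = cycle_mean_variance(300)
print(float(mean), float(variance), log(300))
```

Both moments exceed or trail log n only by a bounded amount, as the schema with α(u) = u predicts.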


EXAMPLE IX.18. Cycles in derangements. The number of cycles is asymptotically normal
in generalized derangements where a finite set S of cycle lengths is forbidden. This results
immediately from the BGF


\[ F(z,u) = \exp\left(uG(z)\right), \qquad G(z) = \log\frac{1}{1-z} - \sum_{s\in S} \frac{z^s}{s}. \]


The classical derangement problem corresponds to S = {1}; see [98].
END OF EXAMPLE IX.18.


EXAMPLE IX.19. Clouds and 2-regular graphs. “Clouds” are defined in [98, p. 274] and they
have already been encountered in Chapters II and VI: let n straight lines in the plane be given
in general position, so that there are $\binom{n}{2}$ intersection points; a cloud of size n is a (maximal)
set of n intersection points, no three of which are collinear. By duality, there is a one-to-one
correspondence between clouds and 2-regular graphs. A 2-regular graph of size n is an
undirected graph with n edges, such that each vertex has degree exactly 2. Any 2-regular graph
may be decomposed into a product of connected components that are (undirected) cycles of
length at least 3. Hence the bivariate EGF for 2-regular graphs, with u marking the number of
connected components, is:


\[ F(z,u) = \exp\left(u\left(\frac{1}{2}\log\frac{1}{1-z} - \frac{z}{2} - \frac{z^2}{4}\right)\right) = \frac{e^{-uz/2 - uz^2/4}}{(1-z)^{u/2}}. \]
The function $\exp(-u(z/2 + z^2/4))$ is entire, so that the conditions of Theorem IX.11
are satisfied. Thus, the number of connected components in a 2-regular graph (this is
equivalent to the number of polygons in a cloud) has a Gaussian limiting distribution.


END OF EXAMPLE IX.19.


EXAMPLE IX.20. Random mappings. Let f denote a function that maps the set N =
{1, 2, ..., n} into itself. Such a function f may be represented by a directed graph $G_f$ with
vertex set N and edge set {(i, f(i)) ; i ∈ N}. Such graphs, in which every point has out-degree
one, are called functional digraphs; see [259, p. 68]. A functional digraph may be viewed as
a set of components that are themselves cycles of rooted labelled trees. The bivariate EGF for
functional digraphs with u marking connected components is


\[ F(z,u) = \exp\left(u \log\frac{1}{1 - T(z)}\right), \]


where the generating function of rooted labelled trees T(z) is the Cayley tree function defined
implicitly by the relation T(z) = z exp(T(z)). By the inversion theorem for implicit functions

we have
\[ T(z) = 1 - \sqrt{2(1 - ez)} + \sum_{k\ge 2} c_k (1 - ez)^{k/2}. \]
Thus,


\[ F(z,u) = \exp\left\{u\left(\frac{1}{2}\log\frac{1}{1 - ez} + H\left((1 - ez)^{1/2}\right)\right)\right\}, \]


where H(v) is analytic at v = 0. From this form and Theorem IX.11, we obtain a theorem
of Stepanov [454]: The number of components in functional digraphs has a limiting Gaussian
distribution.


This approach extends to functional digraphs satisfying various degree constraints as
considered in [14]. This analysis and similar ones are relevant to integer factorization, using
Pollard’s “rho” method [198, 304, 434]. END OF EXAMPLE IX.20.
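
For small n the number of components of a random mapping can be computed by exhaustive enumeration. The sketch below does so with a union-find structure and compares the result with the exact sum over k of n!/((n−k)! n^k k); this formula is obtained here from the BGF by Lagrange inversion and is an assumption of the sketch (it is not displayed in the text), which is why it is validated against brute force in the assertions. Points are numbered from 0 for convenience.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def components(f):
    """Number of connected components of the functional digraph of f,
    where f is a tuple and f[i] is the image of point i (points 0..n-1)."""
    n = len(f)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        a, b = find(i), find(f[i])
        if a != b:
            parent[a] = b
    return sum(1 for i in range(n) if find(i) == i)

def mean_components_brute(n):
    """Exact mean over all n^n mappings, by exhaustive enumeration."""
    total = sum(components(f) for f in product(range(n), repeat=n))
    return Fraction(total, n ** n)

def mean_components_formula(n):
    """Assumed closed form: sum over k of n! / ((n-k)! * n^k * k)."""
    return sum(Fraction(factorial(n), factorial(n - k) * n ** k * k)
               for k in range(1, n + 1))

for n in range(1, 6):
    print(n, mean_components_brute(n), mean_components_formula(n))
```

The enumeration grows as n^n, so it is only practical for very small sizes; the closed form can be evaluated for larger n.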


Unlabelled constructions. In the case of unlabelled structures, the class F of multisets
over a class G has OGF,


\[ F(z) = \sum_{n\ge 0} F_n z^n = \prod_{n\ge 1} (1 - z^n)^{-G_n}. \]


By taking logarithms and reorganizing the corresponding series, we get the alternative
form
\[ F(z) = \exp\left(\frac{G(z)}{1} + \frac{G(z^2)}{2} + \frac{G(z^3)}{3} + \cdots\right). \]
Similarly, in the bivariate case, where u marks the number of components, the bivariate
GF is (see Chapter II),


\[ F(z,u) = \sum_{n,k\ge 0} F_{n,k}\, u^k z^n = \exp\left(\frac{u}{1}G(z) + \frac{u^2}{2}G(z^2) + \frac{u^3}{3}G(z^3) + \cdots\right), \]

which is of the form $\exp(G(z))^{u} \cdot B(z,u)$. Here, we are interested in structures such
that G(z) has a logarithmic singularity, in which case Theorem IX.11 applies, as soon
as G(z) has radius of convergence ρ < 1.


EXAMPLE IX.21. Polynomial factorization. Fix a finite field K = GF(q) and consider the
class P of monic polynomials (having leading coefficient 1) in K[z], with I the subclass of
irreducible polynomials. Obviously, $P_n = q^n$, so that


\[ P(z) = (1 - qz)^{-1}. \]


Because of the unique factorization property, a polynomial is a multiset of irreducible
polynomials, whence the relation


\[ P(z) = \exp\left(\frac{I(z)}{1} + \frac{I(z^2)}{2} + \frac{I(z^3)}{3} + \cdots\right). \]


The preceding relation can be inverted using Möbius inversion. If we set L(z) = log P(z),
then we have


\[ I(z) = \sum_{k\ge 1} \frac{\mu(k)}{k} L(z^k) = \log\frac{1}{1 - qz} + \sum_{k\ge 2} \frac{\mu(k)}{k} L(z^k), \]


where μ is the Möbius function.

Since $L(z^k)$ is analytic for $|z| < q^{-1/2}$ whenever k ≥ 2, and $|L(z^k)| \le c\,|z|^k$ there for some
constant c, the sum $\sum_{k\ge 2} \mu(k)L(z^k)/k$ is analytic for |z| < r, with $q^{-1} < r < q^{-1/2}$. Hence I(z) has an
isolated singularity of logarithmic type at $z = q^{-1} < 1$.

Thus the average number of irreducible factors in a polynomial, and its variance, are both
asymptotically log n + O(1) (this result appears in [304, Ex. 4.6.2.5]). Let $\Omega_n$ be the random
variable representing the number of irreducible factors of a random polynomial of degree n over
GF(q), each factor being counted with its order of multiplicity. Then as n tends to infinity, for
any two real constants λ < μ, we have


\[ P\left\{\log n + \lambda\sqrt{\log n} < \Omega_n < \log n + \mu\sqrt{\log n}\right\} \to \frac{1}{\sqrt{2\pi}} \int_{\lambda}^{\mu} e^{-t^2/2}\,dt. \]
TT Sd 


This statement [210] is a counterpart of the famous Erdős–Kac Theorem (1940) for the number
of prime divisors of natural numbers (with here log n that replaces log log n when dealing with
integers at most n). A similar result holds for the parameter $\omega_n$ that represents the number of
distinct irreducible factors in a random polynomial of degree n. END OF EXAMPLE IX.21.
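
Extracting the coefficient of z^n in the Möbius inversion formula for I(z) gives the count of monic irreducible polynomials of degree n, namely (1/n) Σ_{d|n} μ(d) q^{n/d}. The sketch below computes these counts and checks the identity Σ_{d|n} d·I_d = q^n, which is the logarithmic derivative of the multiset relation for P(z); the function names are illustrative.

```python
def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def irreducible_count(q, n):
    """Number of monic irreducible polynomials of degree n over GF(q)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n  # the sum is always divisible by n

def degree_sum(q, n):
    """Sum over divisors d of n of d * I_d, which should equal q^n."""
    return sum(d * irreducible_count(q, d) for d in range(1, n + 1) if n % d == 0)

print([irreducible_count(2, n) for n in range(1, 11)])
```

Since the count is asymptotic to q^n/n, the class of irreducible polynomials has the logarithmic type of generating function required by the schema.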


It is perhaps instructive to re-examine this last example at an abstract level, in the 
light of general principles of analytic combinatorics. 


A polynomial over a finite field is determined by the sequence 
of its coefficients. Hence, the class of all polynomials, as a se- 
quence class, has a polar singularity. On the other hand, unique 
factorization entails that a polynomial is also a multiset of irre- 
ducible factors (“primes”). Thus, the class of irreducible polynomials,
that is implicitly determined, is logarithmic, since the
multiset construction to be inverted is in essence an exponential
operator. Consequently, the number of irreducible factors obeys 
the exponential-logarithmic scheme, so that it is asymptotically 
Gaussian. 


Eventually, the limit law arises because of the purely analytic character of the gener- 
ating functions involved, together with permanence of analytic relations implied by 
combinatorial constructions. 


EXAMPLE IX.22. Mapping patterns. Let f and g be two functions mapping the set
{1, 2, ..., n} into itself. Mappings f and g are said to be equivalent if there exists a permutation
π of {1, 2, ..., n} such that f(i) = j iff g(π(i)) = π(j). Mapping patterns are thus
equivalence classes of mapping functions, or equivalently functional digraphs on unlabelled points.
They correspond to multisets of cycles of rooted unlabelled trees. The OGF for rooted
unlabelled trees satisfies the implicit relation $A(z) = z\exp\left(\sum_{k\ge 1} A(z^k)/k\right)$, and Otter [382] proved
that
\[ A(z) = 1 - c_1\sqrt{1 - z/\eta} + \sum_{k\ge 2} c_k (1 - z/\eta)^{k/2}, \]


for some η < 1: see our detailed account in Chapter VII.
On the other hand, by the translation of the cycle construction, if G is the unlabelled cycle
construction applied to A, then (see Chapter III),


\[ G(z) = \sum_{k\ge 1} \frac{\varphi(k)}{k} \log\frac{1}{1 - A(z^k)}, \]


where φ(k) is the Euler totient function. In the present context, since A(z) has radius of
convergence η strictly less than 1,


\[ G(z) = \log\frac{1}{1 - A(z)} + S(z), \]


where S(z) is analytic at η. Finally the bivariate OGF for random mapping patterns satisfies


\[ F(z,u) = \exp\left(\sum_{k\ge 1} \frac{u^k}{k} G(z^k)\right) \]
\[ \phantom{F(z,u)} = \exp\left(u\log\frac{1}{1 - A(z)} + uS(z) + T(z,u)\right) \]
\[ \phantom{F(z,u)} = \exp\left(\frac{u}{2}\log\frac{1}{1 - z/\eta} + uH\left((1 - z/\eta)^{1/2}\right) + uS(z) + T(z,u)\right), \]


where S(z) is analytic at η, T(z, u) is analytic for z = η and u = 1, and H is analytic around
0, with H(0) = 0. Thus conditions for applying Theorem IX.11 are satisfied and the number
of components in random mapping patterns has a Gaussian limiting distribution. The mean
value is asymptotic to $\frac{1}{2}\log n$ (this result appears in [357]) and the variance is $\frac{1}{2}\log n + O(1)$.
END OF EXAMPLE IX.22.


EXAMPLE IX.23. Arithmetical semigroups. Knopfmacher [297] defines an arithmetical
semigroup as a semigroup with unique factorization, and a size function (or degree) such that


\[ |xy| = |x| + |y|, \]


where the number of elements of a fixed size is finite. If P is an arithmetical semigroup and I
its set of ‘primes’ (irreducible elements), axiom $A^{\#}$ of Knopfmacher asserts the condition


\[ \operatorname{card}\{x \in P \,/\, |x| = n\} = cq^{n} + O(q^{\alpha n}) \qquad (\alpha < 1). \]
It is shown by Knopfmacher that several algebraic structures forming arithmetical semigroups
satisfy axiom $A^{\#}$, and thus the conditions of Theorem IX.11 are automatically satisfied.
Therefore, the results deriving from Theorem IX.11 fit into the framework of Knopfmacher’s
“abstract analytic number theory”, since they provide general conditions under which theorems of
the Erdős–Kac type must hold true. Examples of application are Galois polynomial rings (the
example of polynomial factorization), finite modules or semisimple finite algebras over a finite
field K = GF(q), integral divisors in algebraic function fields, ideals in the principal order of
an algebraic function field, finite modules, or semisimple finite algebras over a ring of integral
functions. END OF EXAMPLE IX.23.


IX.7.3. Algebraic and implicit functions. Many combinatorial problems,
especially as regards paths and trees, lead to descriptions by context-free languages.
Accordingly, the GF’s are algebraic functions. The most frequent situation is that of
univariate GF’s having singularities of the square-root type.


Corollary IX.1 (Algebraic functions). Let F(z, u) be a bivariate function that is
analytic at (0,0) and has nonnegative coefficients. Assume that F(z,u) is one of the
solutions y of a polynomial equation


\[ \Phi(z,u,y) = 0, \]
where Φ is an irreducible polynomial of total degree m, of degree d ≥ 2 in y. Assume
that F(z, 1) has a unique dominant singularity at ρ > 0, with a singular behaviour
of the square-root type there. Define the resultant polynomial,


\[ \Delta(z,u) = \operatorname{Result}_{y}\left(\Phi(z,u,y),\ \frac{\partial}{\partial y}\Phi(z,u,y)\right), \]


and assume that ρ is a simple root of Δ(z,1). Let ρ(u) be the unique root of the
equation


\[ \Delta(\rho(u),u) = 0, \]
analytic at 1, such that ρ(1) = ρ. Then, provided the variability condition


\[ \mathfrak{v}\left(\frac{\rho(1)}{\rho(u)}\right) > 0 \]


is satisfied, a Gaussian Limit Law holds for the coefficients of F(z, u).


PROOF. The assumption of a square-root singularity (see Chapters VI and VII)
means that the polynomial Φ(ρ,1,y) has a double zero at y = τ, where
$\tau = \lim_{z\to\rho^-} F(z,1)$. Equivalently, we have


\[ \left.\frac{\partial}{\partial y}\Phi(\rho,1,y)\right|_{y=\tau} = 0, \qquad \left.\frac{\partial^2}{\partial y^2}\Phi(\rho,1,y)\right|_{y=\tau} \ne 0. \]


Thus, Weierstrass preparation gives the local factorization
\[ \Phi(z,u,y) = \left(y^2 + c_1(z,u)\,y + c_2(z,u)\right) H(z,u,y), \]


where H(z, u, y) is analytic and nonzero at (ρ, 1, τ) while $c_1(z,u)$, $c_2(z,u)$ are
analytic at (z, u) = (ρ, 1).
From the solution of the quadratic equation, we must have locally


\[ y = \frac{1}{2}\left(-c_1(z,u) \pm \sqrt{c_1(z,u)^2 - 4c_2(z,u)}\right). \]


Consider first (z, u) restricted by 0 < z < ρ and 0 < u < 1. Since F(z, u) is real
there, we must have $c_1(z,u)^2 - 4c_2(z,u)$ also real and nonnegative. Since F(z, u) is
continuous and increasing with z for fixed u, and since the discriminant $c_1(z,u)^2 - 4c_2(z,u)$
vanishes at (ρ, 1), the determination with the minus sign has to be constantly
taken. In summary, we have


\[ F(z,u) = \frac{1}{2}\left(-c_1(z,u) - \sqrt{c_1(z,u)^2 - 4c_2(z,u)}\right). \]


The function $C(z,u) = c_1(z,u)^2 - 4c_2(z,u)$ has a simple real zero at (ρ, 1).
Thus there is locally a unique analytic branch of the solution to C(ρ(u), u) = 0 such
that ρ(1) = ρ. This branch is also by necessity a root of the resultant equation
Δ(ρ(u), u) = 0. The conditions of Theorem IX.10 therefore apply and the Gaussian
law follows.


This theorem asserts that, under suitable conditions, the only possible dominant
singularity of the BGF is a “lifting” of the singularity of the univariate GF F(z, 1)
and the nature of the singularity (the square-root type) does not change. The result
generalizes to the case of a function Φ that is analytic in sufficiently large bounded
domains, e.g., an entire function. The condition is that the analytic curves


\[ \Phi(z,u,y) = 0, \qquad \frac{\partial}{\partial y}\Phi(z,u,y) = 0 \]


have an intersection that “moves analytically” and nontrivially for u near 1, and a
sufficient condition for this is the nonvanishing of the Jacobian determinant


(44)    \[ J(z,u,y) = \begin{vmatrix} \dfrac{\partial}{\partial z}\Phi(z,u,y) & \dfrac{\partial}{\partial y}\Phi(z,u,y) \\[2mm] \dfrac{\partial^2}{\partial z\,\partial y}\Phi(z,u,y) & \dfrac{\partial^2}{\partial y^2}\Phi(z,u,y) \end{vmatrix} \]


and its first derivative with respect to u at (ρ, 1, τ),


(45)    \[ J(\rho,1,\tau) \ne 0, \qquad \left.\frac{\partial J}{\partial u}\right|_{(\rho,1,\tau)} \ne 0. \]


In the case of Corollary IX.1 and of these extensions, the expansion of ρ(u) at u = 1,
hence the mean and variance of the distribution, are computable explicitly from Φ, its
derivatives, and the quantities ρ and τ = F(ρ,1).

The corollary applies to a great variety of decomposable parameters of context-free
languages, tree-like objects, and more generally many recursively defined
combinatorial types. Examples of parameters covered are leaves, node types, and various
sorts of patterns in combinatorial tree models. Drmota has worked out a different
set of conditions for asymptotic normality. In particular, one of Drmota’s important
results [135] yields asymptotic normality, under minor technical restrictions, for a
polynomial system with positive coefficients that is “irreducible”, meaning that the
dependency graph between nonterminals is strongly connected.
▷ IX.34. Nodes of degree k in simple varieties of trees. Their distribution is asymptotically
Gaussian. ◁


▷ IX.35. Leaves in nonplane unlabelled trees. Their distribution is asymptotically Gaussian. ◁


IX.7.4. Differential equations. Ordinary differential equations (ODE’s, for
short) in one variable, when linear and with analytic coefficients, have solutions whose 
singularities occur at well-defined places, namely those that entail a reduction of order. 
The possible singular exponents of solutions are then obtained as roots of a polyno- 
mial equation, the indicial equation. Such ordinary differential equations are usually 
a reflection of a combinatorial decomposition and suitably parametrized versions then 
open access to a number of combinatorial parameters. In this case, the ODE normally 
remains an ODE in the main variable z that records size, while the auxiliary variable u 
only affects the coefficients but not the global shape of the original ODE. 

Three cases may then occur for a linear ODE parametrized by u.


• Movable singularity: the location of the dominant singularity ρ(u) changes
with u but the singular exponent does not change; the analysis is then similar
to that of algebraic-logarithmic singularities.

• Movable exponent: the dominant singularity does not move but the singular
exponent α(u) changes; the analysis then resorts to the exponential-logarithmic
schema.

• Movable singularity and movable exponent: in this case, the singular behaviour
is essentially dictated by the movable singularity but with an auxiliary
contribution arising from the movable exponent; the analysis of this
mixed case then requires an extension of the quasi-power framework, as
developed by Gao and Richmond in [226].


Here, we focus on the important case of a fixed singularity and a movable exponent. 
The required singularity perturbation analysis is inspired by the treatment of Flajolet 
and Lafforgue in [195]. The corresponding univariate problems resort to holonomic 
asymptotics. 


Linear differential equations. The example of the distribution of levels of nodes
in random binary search trees or heap-ordered trees illustrates well the situation of
a fixed singularity and movable exponent. A heap-ordered tree (HOT) is a plane
binary increasing tree. HOTs constitute an unambiguous tree representation of
permutations [434]. The EGF of HOTs is

\[ F(z) = \frac{1}{1-z} = \sum_{n \ge 0} n!\, \frac{z^n}{n!}, \]

as results either from the combinatorial bijection with permutations or from the root
decomposition of increasing trees that translates into the functional equation,

\[ F(z) = 1 + \int_0^z F(t)^2 \, dt, \tag{46} \]

a Riccati equation in disguise. Let $F(z,u)$ be the BGF of HOTs where $u$ records the
depth of external nodes. In other words, $f_{n,k} = [z^n u^k] F(z,u)$ is such that
$\frac{1}{n+1} f_{n,k}$ represents the probability that a random external node in a
random tree of size $n$ is at depth $k$. The probability space is then a product set of
cardinality $(n+1) \cdot n!$, as there are $n!$ trees each containing $(n+1)$ external nodes.
By a standard equivalence principle, the quantities $\frac{1}{n+1} f_{n,k}$ also give the
probability that a random unsuccessful search in a random binary search tree of size $n$
necessitates $k$ comparisons.

Since the depth of a node is inherited from subtrees, the function $F(z,u)$ satisfies
the linear integral equation,

\[ F(z,u) = 1 + 2u \int_0^z F(t,u) \, \frac{dt}{1-t}, \tag{47} \]

or, after differentiation,

\[ \frac{\partial}{\partial z} F(z,u) = \frac{2u}{1-z} F(z,u), \qquad F(0,u) = 1. \]

This equation is in fact a linear ODE with $u$ entering as a parameter,

\[ \frac{d}{dz} y(z) - \frac{2u}{1-z}\, y(z) = 0, \qquad y(0) = 1. \]

The solution of any separable first-order ODE is obtained by quadratures; here,

\[ F(z,u) = \frac{1}{(1-z)^{2u}}. \]

From singularity analysis, provided $u$ avoids $\{0, -\frac12, -1, \ldots\}$, we have

\[ f_n(u) := [z^n] F(z,u) = \frac{n^{2u-1}}{\Gamma(2u)} \left(1 + O\!\left(\frac{1}{n}\right)\right), \]

and the error term is uniform in $u$ provided, say, $|u - 1| \le \frac14$. Thus, Theorem IX.11
applies, and the law with PGF $f_n(u)/f_n(1)$ converges to a Gaussian limit.

A similar result holds for levels of internal nodes, and is proved by similar
devices. The Gaussian profile is even perceptible on a single instance (see the particular
figure in Chapter III), which actually suggests a stronger “functional limit theorem”
for these objects: this has been proved by Chauvin and Jabbour [84] using martingale
theory.

Naturally, explicit expressions are available in such a simple case,

\[ \frac{f_n(u)}{f_n(1)} = \frac{2u \cdot (2u+1) \cdots (2u+n-1)}{(n+1)!}, \]

so a direct proof of the Gaussian limit in the line of Goncharov’s theorem is clearly
possible; see Mahmoud’s book [351, Ch. 2], for this result originally due to Louchard.
What is interesting here is the fact that $F(z,u)$ viewed as a function of $z$ has a
singularity at $z = 1$ that does not move and, in a way, originates in the combinatorics of
the problem, namely the EGF $(1-z)^{-1}$ of permutations. The auxiliary parameter $u$ appears
here directly in the exponent, so that the application of singularity analysis or of the
more sophisticated Theorem IX.11 is immediate.
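
The explicit product form lends itself to a quick exact check. The following is a minimal sketch in Python (the function names are ours and purely illustrative): it expands the product 2u(2u+1)···(2u+n−1)/(n+1)! into the probabilities of the external-node depth and compares the mean with 2(H_{n+1} − 1), the value obtained by logarithmic differentiation of the product at u = 1.

```python
from fractions import Fraction
from math import factorial

def external_depth_pgf(n):
    """Coefficients in u of 2u(2u+1)...(2u+n-1)/(n+1)!, i.e. f_n(u)/f_n(1)."""
    poly = [Fraction(1)]                      # start from the constant polynomial 1
    for j in range(n):                        # multiply by the factor (2u + j)
        nxt = [Fraction(0)] * (len(poly) + 1)
        for k, c in enumerate(poly):
            nxt[k] += j * c
            nxt[k + 1] += 2 * c
        poly = nxt
    return [c / factorial(n + 1) for c in poly]

def mean(pgf):
    """Mean of the distribution whose probabilities are the entries of pgf."""
    return sum(k * p for k, p in enumerate(pgf))

def harmonic(m):
    """Harmonic number H_m as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, m + 1))

if __name__ == "__main__":
    pgf = external_depth_pgf(10)
    print(float(mean(pgf)), float(2 * (harmonic(11) - 1)))
```

For n = 2 the sketch returns the probabilities 1/3 and 2/3 for depths 1 and 2, in agreement with the three external nodes of a tree of size 2.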

Corollary IX.2 (Linear differential equations). Let $F(z,u)$ be a bivariate generating
function with nonnegative coefficients that satisfies a linear differential equation

\[ a_0(z,u) \frac{\partial^r F}{\partial z^r} + \frac{a_1(z,u)}{(\rho - z)} \frac{\partial^{r-1} F}{\partial z^{r-1}} + \cdots + \frac{a_r(z,u)}{(\rho - z)^r} F = 0, \]

with $a_j(z,u)$ analytic at $\rho$, and $a_0(\rho,1) \ne 0$. Let $f_n(u) = [z^n] F(z,u)$, and assume
the following conditions:

• [Nonconfluence] The indicial polynomial

\[ J(\alpha) = a_0(\rho,1)(\alpha)_{(r)} + a_1(\rho,1)(\alpha)_{(r-1)} + \cdots + a_r(\rho,1) \tag{48} \]

has a unique root $\sigma > 0$ which is simple and such that all other roots $\alpha \ne \sigma$
satisfy $\Re(\alpha) < \sigma$;

• [Dominant growth] $f_n(1) \sim C \cdot \rho^{-n} n^{\sigma-1}$, for some $C > 0$.

• [Variability condition]

\[ \sup \frac{\mathfrak{v}(f_n(u))}{\log n} > 0. \]

Then the coefficients of $F(z,u)$ admit a limit Gaussian law.


PROOF. (See the paper by Flajolet and Lafforgue [195] for a detailed example or the
books by Henrici [265] and Wasow [490] for a general treatment of singularities of
linear ODEs.) We assume in this proof that no two roots of the indicial polynomial (48)
differ by an integer. Consider first the univariate problem. A differential equation,

\[ a_0(z) \frac{d^r F}{dz^r} + \frac{a_1(z)}{(\rho - z)} \frac{d^{r-1} F}{dz^{r-1}} + \cdots + \frac{a_r(z)}{(\rho - z)^r} F = 0, \tag{49} \]

with the $a_j(z)$ analytic at $\rho$ and $a_0(\rho) \ne 0$ has a basis of local singular solutions
obtained by substituting $(\rho - z)^{-\alpha}$ and cancelling the terms of maximum order of
growth. The candidate exponents are thus roots of the indicial equation,

\[ I(\alpha) = a_0(\rho)(\alpha)_{(r)} + a_1(\rho)(\alpha)_{(r-1)} + \cdots + a_r(\rho) = 0. \]

If there is a unique (simple) root of maximum real part, $\alpha_1$, then there exists a solution
to (49) of the form

\[ Y_1(z) = (\rho - z)^{-\alpha_1} h_1(\rho - z), \]

where $h_1(w)$ is analytic at 0 and $h_1(0) = 1$. (This results easily from a solution by
indeterminate coefficients.) All other solutions are then of smaller growth and of the
form

\[ Y_j(z) = (\rho - z)^{-\alpha_j} h_j(\rho - z) \left(\log(z - \rho)\right)^{k_j}, \]

for some integers $k_j$ and some functions $h_j(w)$ analytic and nonzero at $w = 0$. Then,
$F(z)$ has the form

\[ F(z) = \sum_{j=1}^{r} c_j Y_j(z). \]

Then, provided $c_1 \ne 0$,

\[ [z^n] F(z) = \frac{c_1}{\Gamma(\alpha_1)} \rho^{-n-\alpha_1} n^{\alpha_1 - 1} (1 + o(1)). \]




Under the assumptions of the theorem, we must have $\sigma = \alpha_1$, and $c_1 \ne 0$. The reality
assumption is natural for a series $F(z)$ that has real coefficients.

When $u$ varies in a neighbourhood of 1, we have a uniform expansion

\[ F(z,u) = c_1(u) (\rho - z)^{-\sigma(u)} H_1(\rho - z, u) (1 + o(1)), \tag{50} \]

for some bivariate analytic function $H_1(w,u)$ with $H_1(0,u) = 1$, where $\sigma(u)$ is the
algebraic branch that is a root of

\[ J(\alpha, u) = a_0(\rho,u)(\alpha)_{(r)} + a_1(\rho,u)(\alpha)_{(r-1)} + \cdots + a_r(\rho,u) = 0, \]

and coincides with $\sigma$ at $u = 1$. By singularity analysis, this entails

\[ [z^n] F(z,u) = \frac{c_1(u)}{\Gamma(\sigma(u))} \rho^{-n-\sigma(u)} n^{\sigma(u)-1} (1 + o(1)), \tag{51} \]

uniformly for $u$ in a small neighbourhood of 1, with the error term being $O(n^{-a})$ for
some $a > 0$. Thus Theorem IX.11 applies and the limit law is Gaussian.

The crucial point in (50) and (51) is the uniform character of expansions with respect
to $u$. This results from two facts: (i) the solution to (49) may be specified by analytic
conditions at a point $z_0$ such that $z_0 < \rho$ and there are no singularities of the equation
between $z_0$ and $\rho$; (ii) there is a suitable set of solutions with an analytic component
in $z$ and $u$ and singular parts of the form $(\rho - z)^{-\alpha_j(u)}$, as results from the matrix
theory of differential systems and majorant series. (This last point is easily verified if
no two roots of the indicial equation differ by an integer; otherwise, see [195] for an
alternative basis of solutions for $u$ near 1, $u \ne 1$.)


EXAMPLE IX.24. Node levels in quadtrees. This example is taken from [195]. Quadtrees are
one of the most versatile data structures for managing a collection of points in multidimensional
space. They are based on a recursive decomposition similar to that of BSTs.

Here $d$ is the dimension of the data space. Let $f_{n,k}$ be the number of external nodes at level
$k$ in a quadtree of size $n$ grown by random insertions, and let $F(z,u)$ be the corresponding BGF.
Two integral operators play an essential rôle,

\[ \mathbf{I}\, g(z) = \int_0^z g(t) \, \frac{dt}{1-t}, \qquad \mathbf{J}\, g(z) = \int_0^z g(t) \, \frac{dt}{t(1-t)}. \]

The basic equation that reflects the recursive splitting process of quadtrees is then

\[ F(z,u) = 1 + 2^d u \, \mathbf{J}^{d-1} \mathbf{I}\, F(z,u). \tag{52} \]

The integral equation (52) satisfied by $F$ then transforms into a differential equation of order $d$,

\[ \mathbf{I}^{-1} \mathbf{J}^{1-d} F(z,u) = 2^d u F(z,u), \]

where

\[ \mathbf{I}^{-1} g(z) = (1-z)\, g'(z), \qquad \mathbf{J}^{-1} g(z) = z(1-z)\, g'(z). \]

The linear ODE version of (52) has an indicial polynomial that is easily determined by
examination of the reduced form of the ODE (52) at $z = 1$. There, one has

\[ \mathbf{J}^{-1} g(z) = \mathbf{I}^{-1} g(z) - (z-1)^2 g'(z) \approx (1-z)\, g'(z). \]

Thus,

\[ \mathbf{I}^{-1} \mathbf{J}^{1-d} (1-z)^{-\alpha} = \alpha^d (1-z)^{-\alpha} + O\!\left((1-z)^{-\alpha+1}\right), \]

and the indicial polynomial is

\[ J(\alpha, u) = \alpha^d - 2^d u. \]

In the univariate case, the root of largest real part is $\alpha_1 = 2$; in the bivariate case, we have

\[ \alpha_1(u) = 2u^{1/d}, \]

where the principal branch is chosen. Thus,

\[ f_n(u) = \gamma(u)\, n^{\alpha_1(u) - 1} (1 + o(1)). \]

By the combinatorial origin of the problem, $F(z,1) = (1-z)^{-2}$, so that the coefficient $\gamma(1)$
is nonzero. Thus, the conditions of the corollary are satisfied. The law is Gaussian in the limit,
with mean and variance

\[ \mu_n \sim \frac{2}{d} \log n, \qquad \sigma_n^2 \sim \frac{2}{d^2} \log n. \]

The same result applies to the cost of a random search, either successful or not, as shown
in [195] by an easy combinatorial argument. ................. END OF EXAMPLE IX.24.
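
The constants 2/d and 2/d² can be recovered mechanically from the movable exponent. The sketch below is only an illustration under the assumption that the cumulant generating function of the level is log n · (α_1(e^t) − α_1(1)) to leading order, as the quasi-power form suggests; the helper names are ours. It differentiates t ↦ α_1(e^t) numerically at t = 0.

```python
import math

def quadtree_exponent(u, d):
    """Dominant root alpha_1(u) = 2 u^(1/d) of the indicial polynomial alpha^d - 2^d u."""
    return 2.0 * u ** (1.0 / d)

def log_scale_constants(d, h=1e-4):
    """Central differences of t -> alpha_1(e^t) at t = 0.

    The first derivative is the coefficient of log n in the mean,
    the second derivative the coefficient of log n in the variance.
    """
    g = lambda t: quadtree_exponent(math.exp(t), d)
    mean_coeff = (g(h) - g(-h)) / (2 * h)
    var_coeff = (g(h) - 2 * g(0.0) + g(-h)) / (h * h)
    return mean_coeff, var_coeff

if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, log_scale_constants(d))
```

For d = 1 the quadtree is a binary search tree and the constants reduce to 2 and 2, in agreement with the exponent 2u found for heap-ordered trees.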


Nonlinear differential equations. Though nonlinear differential equations do not
obey a simple classification of singularities, there are a few examples in analytic
combinatorics that can be treated by singularity perturbation methods. We detail here
a typical analysis of properties of binary search trees (BSTs), equivalently HOTs, that is
taken from [185]. The Riccati equation involved reduces, by classical techniques, to
a linear second order equation whose perturbation analysis is particularly transparent
and akin to earlier analyses of ODEs. In this problem, the auxiliary parameter induces
a movable singularity that directly resorts to the Quasi-Powers Theorem.


EXAMPLE IX.25. Paging of binary search trees. Fix a “bucket size” parameter $b \ge 2$. Given
a binary search tree $t$, its $b$-index is a tree that is constructed by retaining only those internal
nodes of $t$ which correspond to subtrees of size $> b$. As a data structure, such an index is
well-suited to “paging”, where one has a two-level hierarchical memory structure: the index resides
in main memory and the rest of the tree is kept in pages of capacity $b$ on peripheral storage, see
for instance [351]. We let $\iota[t] = \iota_b[t]$ denote the size (number of nodes) of the $b$-index of $t$.
Like in Eq. (46), the bivariate generating function

\[ F(z,u) := \sum_t \lambda(t)\, u^{\iota[t]} z^{|t|} \]

satisfies a Riccati equation that reflects the root decomposition of trees,

\[ \frac{\partial}{\partial z} F(z,u) = u F^2(z,u) + (1-u) \frac{d}{dz}\left( \frac{1 - z^{b+1}}{1 - z} \right), \qquad F(0,u) = 1, \tag{53} \]

where the general quadratic relation (46) has to be corrected in its low order terms.
The GFs of moments are rational functions with a denominator that is a power of $(1 - z)$,
as results from differentiation at $u = 1$. Mean and variance follow:

\[ \mu_n = \frac{2(n+1)}{b+2} - 1, \qquad \sigma_n^2 = \frac{2}{3} \frac{(b-1)b(b+1)}{(b+2)^2} (n+1). \]

(The result for the mean is well-known, refer to quantity $A_n$ in the analysis of quicksort on
p. 122 of [302].)
Multiplying both sides of (53) by $u$ now gives an equation satisfied by $H(z,u) := uF(z,u)$,

\[ \frac{\partial}{\partial z} H(z,u) = H^2(z,u) + u(1-u) \frac{d}{dz}\left( \frac{1 - z^{b+1}}{1 - z} \right), \]
that may as well be taken as a starting point since $H(z,u)$ is the bivariate GF of parameter $1 + \iota_b$
(a quantity also equal to the number of external pages). The classical linearization
transformation of Riccati equations,

\[ H(z,u) = -\frac{X'_z(z,u)}{X(z,u)}, \]

yields

\[ \frac{\partial^2}{\partial z^2} X(z,u) + u(1-u) A(z) X(z,u) = 0, \qquad A(z) = \frac{d}{dz}\left( \frac{1 - z^{b+1}}{1 - z} \right), \tag{54} \]

with $X(0,u) = 1$, $X'_z(0,u) = -u$. By the classical existence theorem of Cauchy, the solution
of (54) is an entire function of $z$ for each fixed $u$, as the linear differential equation has no
singularity at a finite distance. Furthermore, the dependency of $X$ on $u$ is also everywhere
analytic; see the remarks of [490, Sec. 24], for which a proof derives by inspection of the
classical existence proof based on indeterminate coefficients and majorant series. Thus, $X(z,u)$
is actually an entire function of both complex variables $z$ and $u$. As a consequence, for any fixed
$u = u_0$, the function $H(z, u_0)$ is a meromorphic function of $z$ whose coefficients are amenable
to singularity analysis.

In order to proceed further, we need to prove that, in a sufficiently small neighbourhood of
$u = 1$, $X(z,u)$ has only one simple root, corresponding for $H(z,u)$ to a unique dominant and
simple pole. This fact itself derives from general considerations surrounding the Preparation
Theorem of Weierstrass: in the vicinity of any point $(z_0, u_0)$ with $X(z_0, u_0) = 0$, the roots
of the bivariate analytic equation $X(z,u) = 0$ are locally branches of an algebraic function.
Here, we have $X(z,1) = 1 - z$. Thus, as $u$ tends to 1, all solutions of $X(z,u) = 0$ must escape to
infinity except for one branch $\rho(u)$ that satisfies $\rho(1) = 1$. By the nonvanishing of $X'_z(z,1)$ and
the implicit function theorem, the function $\rho(u)$ is additionally an analytic function about
$u = 1$.

The argument is now complete: for $u$ in a sufficiently small complex neighbourhood of 1, we
have a Quasi-Powers approximation,

\[ [z^n] H(z,u) = \rho(u)^{-n-1} \left(1 + O(K^{-n})\right), \]

for some fixed constant $K > 1$. The Gaussian limit results. ... END OF EXAMPLE IX.25.
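
The Riccati equation of this example is equivalent to a convolution recurrence on the probability generating functions, which makes small cases easy to tabulate. The following is a minimal sketch (function names are ours and hypothetical), assuming the standard random-BST model in which the root of a tree of size n has a left subtree of size k with probability 1/n: the index is empty for n ≤ b, and otherwise f_n(u) = (u/n) Σ_k f_k(u) f_{n−1−k}(u).

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def b_index_pgfs(n_max, b):
    """f[n][k] = probability that the b-index of a random BST of size n has k nodes."""
    f = [[Fraction(1)] for _ in range(min(b, n_max) + 1)]   # sizes 0..b: empty index
    for n in range(b + 1, n_max + 1):
        acc = [Fraction(0)]
        for k in range(n):                      # root splits the keys into k and n-1-k
            prod = poly_mul(f[k], f[n - 1 - k])
            if len(prod) > len(acc):
                acc += [Fraction(0)] * (len(prod) - len(acc))
            for i, c in enumerate(prod):
                acc[i] += c
        f.append([Fraction(0)] + [c / n for c in acc])      # the factor u/n
    return f

def mean(pgf):
    return sum(k * p for k, p in enumerate(pgf))

if __name__ == "__main__":
    f = b_index_pgfs(12, 2)
    for n in range(3, 13):
        print(n, mean(f[n]), Fraction(2 * (n + 1), 4) - 1)
```

In this sketch the exact means agree with 2(n+1)/(b+2) − 1 as soon as n > b.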


As shown in [185], a similar analysis applies to patterns in binary search trees
and heap-ordered trees. This is related to the analysis of local order patterns in
permutations, for which Gaussian limit laws have been obtained by Devroye [125] using
extensions of the central limit theorem to weakly dependent random variables.

Similar displacements of singularity arise for node types in varieties of increasing
trees, extending the case of HOTs that are binary. This is discussed in [40]. For
instance, if $\phi(w)$ is the degree generator of a family of increasing trees, the nonlinear
ODE satisfied by the BGF of leaves is

\[ \frac{\partial}{\partial z} F(z,u) = (u - 1)\phi(0) + \phi(F(z,u)). \]

Whenever $\phi$ is a polynomial, there is a spontaneous singularity at some $\rho(u)$ that
depends analytically on $u$. Thus, again the Quasi-Powers Theorem applies; see [40].


IX. 8. Perturbation of saddle point asymptotics 


We shall be brief here, as the subject is excellently covered in Sachkov’s book to
which we refer for details. Entire functions and functions with a fast growth at their
singularity do not in general lead to quasi-power expansions. As we know from
univariate asymptotics (Chapter VIII), the coefficient expansions involve a combination
of large powers (that arise from the Cauchy kernel) and of the very fast singular
behaviour of the function under consideration. Accordingly, bivariate asymptotic studies
necessitate a perturbation of saddle point expansions. A framework more flexible than
the Quasi-Powers Theorem is then needed.

Here, we base our brief discussion on a theorem taken from Sachkov’s book [422].

Theorem IX.12 (Generalized quasi-powers). Assume that the generating function
$p_n(u)$ of a discrete random variable $X_n$ has a representation of the form

\[ p_n(u) = \exp(h_n(u)) \, (1 + o(1)), \]

that holds uniformly, where each $h_n(u)$ is analytic in a fixed neighbourhood $\Omega$ of 1.
Assume also the condition,

\[ \frac{h_n'''(u)}{\left(h_n'(1) + h_n''(1)\right)^{3/2}} \to 0, \tag{55} \]

uniformly for $u \in \Omega$. Then, the random variable

\[ X_n^\star = \frac{X_n - h_n'(1)}{\left(h_n'(1) + h_n''(1)\right)^{1/2}} \]

converges in distribution to a normal law with parameters (0, 1).

PROOF. See [422, Sec. 1.4] for details. Set $\sigma_n^2 = h_n'(1) + h_n''(1)$, and expand the
Laplace transform of $X_n$ at $t/\sigma_n$. This gives

\[ h_n(e^{t/\sigma_n}) = h_n'(1) \frac{t}{\sigma_n} + \left(h_n'(1) + h_n''(1)\right) \frac{t^2}{2\sigma_n^2} + o(1). \]

Thus, the Laplace transform of $X_n^\star$ converges to the transform of a standard Gaussian.


This theorem extends the quasi-power scheme. In effect, if

\[ h_n(u) = \beta_n \log B(u) + A(u), \]

then the quantity (55) is $O(\beta_n^{-1/2})$ uniformly. The application of this theorem to
saddle point integrals is in principle routine, though the manipulation of asymptotic
scales associated with expressions involving the saddle point value may become
cumbersome. We detail here the case of singletons in random involutions for which the
saddle point is an algebraic function of $n$ and $u$.

▷ IX.36. Effective speed bounds. A metric version of the theorem, with error terms, can be
developed assuming suitable error bounds. ◁


EXAMPLE IX.26. Singletons in random involutions. This example is again borrowed from
Sachkov’s book [422]. The BGF is

\[ F(z,u) = \exp\left( zu + \frac{z^2}{2} \right). \]

The saddle point equation (see Chapter VIII) is then

\[ \frac{d}{d\zeta} \left( \zeta u + \frac{\zeta^2}{2} - (n+1) \log \zeta \right) = 0. \]

This defines the saddle point $\zeta = \zeta(n,u)$,

\[ \zeta(n,u) = -\frac{u}{2} + \frac{1}{2}\sqrt{4n + 4 + u^2} = \sqrt{n} - \frac{u}{2} + \frac{u^2 + 4}{8\sqrt{n}} + O(n^{-1}), \]

where the error term is uniform for $u$ near 1. By the saddle point formula, one has

\[ [z^n] F(z,u) = \frac{1}{\sqrt{2\pi D(n,u)}} F(\zeta(n,u), u)\, \zeta(n,u)^{-n}. \]

The denominator is determined in terms of second derivatives, according to the classical saddle
point formula (Chapter VIII),

\[ D(n,u) = \left( z^2 \frac{\partial^2}{\partial z^2} + z \frac{\partial}{\partial z} \right) \left( zu + \frac{z^2}{2} \right) \bigg|_{z = \zeta}, \]

and its main asymptotic order does not change when $u$ varies in a sufficiently small
neighbourhood of 1,

\[ D(n,u) = 2n - u\sqrt{n} + O(1), \]

again uniformly. Thus, the PGF of the number of singleton cycles satisfies

\[ p_n(u) = \frac{F(\zeta(n,u), u)}{F(\zeta(n,1), 1)} \left( \frac{\zeta(n,1)}{\zeta(n,u)} \right)^{n} (1 + o(1)), \]

uniformly, for $u$ near 1. This is of the form

\[ p_n(u) = \exp(h_n(u)) \, (1 + o(1)), \]

and local expansions then yield the centering constants

\[ a_n := h_n'(1) = \sqrt{n} - \frac12 + O(n^{-1/2}), \qquad b_n^2 := h_n'(1) + h_n''(1) = \sqrt{n} - 1 + O(n^{-1/2}). \]

The theorem applies directly to this case and the variable

\[ X_n^\star = \frac{X_n - a_n}{b_n} \]

is asymptotic to a standard normal.

A little care with the error terms in the asymptotic expansions shows that the mean and
standard deviation $\mu_n, \sigma_n$ are asymptotic to $a_n, b_n$, respectively. Therefore, the number of
singletons in a random involution of size $n$ has mean $\mu_n$ and standard deviation $\sigma_n$ that satisfy

\[ \mu_n \sim n^{1/2}, \qquad \sigma_n \sim n^{1/4}. \]

This computation also determines the law of doubleton cycles and of all cycles, that are given
by

\[ \frac12 (n - X_n), \qquad \frac12 (n + X_n), \]

respectively. In particular, the number of doubleton cycles has average $\frac12 n - \frac12 n^{1/2}$.
Thus, a random involution has a relatively small number of singleton cycles.
END OF EXAMPLE IX.26.
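
The estimates of this example can be compared with exact values, since the number of involutions of size n with k singletons and m = (n − k)/2 doubletons is n!/(k! m! 2^m). The sketch below (helper names are ours) tabulates these counts, computes the exact mean number of singletons, and evaluates the saddle point as the positive root of ζ² + uζ − (n + 1) = 0.

```python
from fractions import Fraction
from math import factorial, sqrt

def involution_counts(n):
    """counts[k] = number of involutions of size n with exactly k singleton cycles."""
    counts = [0] * (n + 1)
    for k in range(n % 2, n + 1, 2):          # n - k must be even
        m = (n - k) // 2                      # number of doubleton cycles
        counts[k] = factorial(n) // (factorial(k) * factorial(m) * 2 ** m)
    return counts

def mean_singletons(n):
    """Exact mean number of singleton cycles in a random involution of size n."""
    counts = involution_counts(n)
    return Fraction(sum(k * c for k, c in enumerate(counts)), sum(counts))

def saddle_point(n, u=1.0):
    """Positive root of zeta^2 + u*zeta - (n + 1) = 0."""
    return -u / 2 + sqrt(4 * n + 4 + u * u) / 2

if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, float(mean_singletons(n)), sqrt(n) - 0.5, saddle_point(n))
```

For moderate n the exact mean is already close to √n − 1/2, the leading part of the centering constant a_n.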

EXAMPLE IX.27. The Stirling partition numbers. The numbers ${n \brace k}$ correspond to the BGF

\[ F(z,u) = \exp\left( u(e^z - 1) \right). \]

The saddle point $\zeta = \zeta(n,u)$ is the positive root near $\log n$ (so that $e^\zeta$ is near $n/\log n$) of the equation

\[ \zeta e^{\zeta} = \frac{n+1}{u}. \]

The theorem applies:
Proposition IX.14. The Stirling partition distribution defined by $\frac{1}{S_n} {n \brace k}$, with $S_n$ a Bell
number, is asymptotically normal, with mean and variance that satisfy

\[ \mu_n \sim \frac{n}{\log n}, \qquad \sigma_n^2 \sim \frac{n}{(\log n)^2}. \]

We refer once more to Sachkov’s book for computational details.
END OF EXAMPLE IX.27.
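
For small n the Stirling partition distribution is easily tabulated from the classical recurrence S(m,k) = k S(m−1,k) + S(m−1,k−1). The sketch below (function names are ours) does so and checks the exact identity that the mean number of blocks equals S_{n+1}/S_n − 1, with S_n the Bell numbers; it makes no attempt to reproduce the asymptotic estimates, whose convergence is only logarithmic.

```python
from fractions import Fraction

def stirling_partition_row(n):
    """Row [S(n,0), ..., S(n,n)] of the Stirling partition numbers."""
    row = [1]                                   # S(0,0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            prev_k = row[k] if k < m else 0     # S(m-1,k), zero when k = m
            new[k] = k * prev_k + row[k - 1]    # S(m,k) = k S(m-1,k) + S(m-1,k-1)
        row = new
    return row

def bell(n):
    """Bell number S_n, the total number of set partitions of size n."""
    return sum(stirling_partition_row(n))

def mean_blocks(n):
    """Exact mean number of blocks of a random set partition of size n."""
    row = stirling_partition_row(n)
    return Fraction(sum(k * s for k, s in enumerate(row)), sum(row))

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, float(mean_blocks(n)))
```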


Summarizing the last example as well as earlier results, we now have the fact that
all four Stirling-related distributions,

\[ \frac{1}{n!} {n \brack k}, \qquad \frac{k!}{O_n} {n \brack k}, \qquad \frac{1}{S_n} {n \brace k}, \qquad \frac{k!}{R_n} {n \brace k}, \]

associated to permutations, alignments, set partitions, and surjections are
asymptotically Gaussian.


Saddle point and functional equations. The average-case analysis of the number of
nodes in random digital trees or “tries” can be carried out using the Mellin transform
technology. The corresponding distributional analysis is appreciably harder and due
to Jacquet and Régnier [278]. A complete description is offered in Section 5.4 of
Mahmoud’s book which we follow. What is required is to analyse the BGF

\[ F(z,u) = e^{z} T(z,u), \]

where the Poisson generating function $T(z,u)$ satisfies the nonlinear difference
equation,

\[ T(z,u) = u\, T^2\!\left( \frac{z}{2}, u \right) + (1-u)(1+z)e^{-z}. \]

This equation is a direct reflection of the problem specification. At $u = 1$, one has
$T(z,1) = 1$, $F(z,1) = e^{z}$. The idea is thus to analyse $[z^n] F(z,u)$ by the saddle point
method.

The saddle point analysis of $F$ requires asymptotic information on $T(z,u)$ for
$u = e^{it}$ (the original treatment of [278] is based on characteristic functions). There,
the main idea is to “quasi-linearize” the problem, setting

\[ L(z,u) = \log T(z,u), \]

with $u$ a parameter. This function satisfies the approximate relation $L(z,u) \approx
2L(z/2,u)$, and a bootstrapping argument shows that, in suitable regions of the complex
plane, $L(z,u) = O(|z|)$, uniformly with respect to $u$. The function $L(z,u)$ is
then expanded with respect to $u = e^{it}$ at $u = 1$, i.e., $t = 0$, using a Taylor expansion,
its companion integral representation, and the bootstrapping bounds. The moment-like
quantities,

\[ L_j(z) = \frac{\partial^j}{\partial t^j} L(z, e^{it}) \bigg|_{t=0}, \]

can be subjected to Mellin analysis for $j = 1, 2$ and bounded for $j \ge 3$. In this way,
there results that

\[ L(z, e^{it}) = L_1(z)\, t + \frac12 L_2(z)\, t^2 + O(z t^3), \]

uniformly. The Gaussian law under a Poisson model immediately results from the
continuity theorem of characteristic functions. Under the original Bernoulli model,
the Gaussian limit follows from a saddle point analysis of

\[ F(z, e^{it}) = e^{z} e^{L(z, e^{it})}. \]


An even more delicate analysis has been carried out by Jacquet and Szpankowski
in [279]. It is relative to path length in digital search trees and involves the formidable
non-linear bivariate difference-differential equation

\[ \frac{\partial}{\partial z} F(z,u) = F^2\!\left( \frac{zu}{2}, u \right). \]


IX. 9. Local limit laws


Under conditions similar to those of the Quasi-Powers Theorem, a cluster of
conclusions may be drawn regarding densities of distributions and probabilities of large
deviations from the mean. We examine here the occurrence of local limit laws, which
corresponds to convergence of a discrete probability distribution to the Gaussian
density function rather than convergence of distribution functions to the Gaussian error
function, as we have seen so far. Such local laws hold very frequently, but their proofs
require some sort of additional “smoothness” assumptions, either combinatorial or
analytic. Under assumptions of the Quasi-Powers Theorem, it is also possible to
quantify precisely the exponential rate of decay for probabilities of rare events, far away
from the center of the distribution. This section explores both aspects, which fit well
within the general framework of quasi-powers. One aspect provides precise
asymptotic information on values of the individual probabilities, especially near the mean;
the other aspect quantifies the smallness of probabilities far away from the mean and,
when conditions apply, it provides sharp quantitative versions of the concentration of
distribution discussed at the beginning of this chapter.

So far, we have examined the occurrence of continuous limit laws in the sense of
convergence of distribution functions. Thus, a standardized $Y_n$ converges in
distribution to $Y$, if

\[ P\{Y_n \le x\} \to P\{Y \le x\}. \]

In the case of a Gaussian limit that arises from a sequence of discrete distributions of
variables $X_n$ with mean and variance $\mu_n, \sigma_n^2$, such a property quantifies the
probabilities over any nonempty interval scaled according to $\sigma_n$,

\[ \Pr\{\mu_n + a\sigma_n < X_n \le \mu_n + b\sigma_n\} = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2} \, dx + o(1), \tag{56} \]

for any $a, b$ with $a < b$. From there, it is however in general not possible to draw
information on any individual probability,

\[ p_{n,k} = P\{X_n = k\}, \]

by differencing, since the error terms in (56) will usually hide any nontrivial
asymptotic information on individual $p_{n,k}$.

FIGURE IX.11. The histogram of the Eulerian distribution scaled to $(n+1)$ on
the horizontal axis, for $n = 3..60$. The distribution is seen to quickly converge
to a bell-shaped curve corresponding to the Gaussian density $e^{-x^2/2}/(2\pi)^{1/2}$.

On the other hand, numerical examination of discrete probability distributions
reveals that the histograms of the $p_{n,k}$ often assume a bell-shape profile in the
asymptotic limit. For instance Figure 11, borrowed from our book [434], displays the $p_{n,k}$
that correspond to the Eulerian numbers. For a given value of $n$, the maximum
probability $p_{n,k}$ is seen to occur “in the middle”, near the mean, and to obey an approximate
law,

\[ p_{n, \lfloor n/2 \rfloor} \approx \frac{1.35}{\sqrt{n}}, \]

for values near $n = 60$. The standard deviation of the distribution is otherwise known
to be $\sim \sqrt{n/12}$. Thus, we expect an approximate formula of the form

\[ p_{n,\, n/2 + x\sqrt{n/12}} \approx C \, \frac{e^{-x^2/2}}{\sqrt{n}}, \]

for integral values of the argument $k = n/2 + x\sqrt{n/12}$, with some constant $C$ about
1.35.

Definition IX.4. A sequence of discrete probability distributions, $p_{n,k} = P\{X_n = k\}$,
with mean $\mu_n$ and standard deviation $\sigma_n$ is said to obey a local limit law of the
Gaussian type if, for some set $S$ of real numbers, and a sequence $\epsilon_n \to 0$,

\[ \sup_{x \in S} \left| \sigma_n \, p_{n, \lfloor \mu_n + x\sigma_n \rfloor} - \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \right| \le \epsilon_n. \]

The local limit law is said to hold on $S$ and the law is said to hold with relative speed
of convergence $\epsilon_n$.

When such a local limit law exists, it usually holds on arbitrary bounded intervals
of the real line.

Theorem IX.13 (Local limit law). Let $X_n$ be a sequence of nonnegative discrete
random variables with probability generating function $p_n(u)$. Assume that uniformly
in an annulus,

\[ 1 - \epsilon \le |u| \le 1 + \epsilon, \qquad \epsilon > 0, \]

the PGFs satisfy

\[ p_n(u) = A(u) \left( B(u) \right)^{\beta_n} \left( 1 + O\!\left( \frac{1}{\kappa_n} \right) \right), \tag{57} \]

where $A(u), B(u)$ are analytic in the annulus and $A(1) = B(1) = 1$, $\mathfrak{v}(B(u)) =
B''(1) + B'(1) - B'(1)^2 \ne 0$. Assume also that $B(u)$ attains uniquely its maximum on
$|u| = 1$ at $u = 1$: for all $v$, with $|v| = 1$ and $v \ne 1$, one has $|B(v)| < 1$.

Under these conditions, the distribution of $X_n$ satisfies a local limit law of the
Gaussian type on arbitrary bounded intervals of the real line.

Note that the mean and variance of $X_n$ are given by Eq. (20).

PROOF. A direct application of the saddle point method, as developed in
Chapter VIII.

This theorem applies in particular to quasi-power expansions, whenever the
dominant singularity $\rho(u)$, that is a perturbation of the dominant singularity $\rho$ of the
univariate problem, is analytic at all points of $|u| = 1$ and uniquely attains its minimum
at $u = 1$.


EXAMPLE IX.28. Local laws for sums of RV’s. The simplest application is to the binomial
distribution, for which

\[ p_n(u) = \frac{(1+u)^n}{2^n}. \]

In a precise technical sense, the local limit arises in the BGF,

\[ F(z,u) = \frac{1}{1 - z(1+u)/2}, \]

because the dominant singularity $\rho(u) = 2/(1+u)$ exists on the whole of the unit circle, $|u| = 1$,
and it attains uniquely its minimum modulus at $u = 1$; accordingly, $B(u) = \rho(1)/\rho(u)$ is
uniquely maximal at $u = 1$.

More generally, the theorem applies to any sum $S_n = T_1 + \cdots + T_n$ of independent,
identically distributed random variables whose maximal span is equal to 1 and whose PGF $B(u)$ is
analytic on the unit circle. In that case, the BGF is

\[ F(z,u) = \frac{1}{1 - zB(u)}, \]

the PGF of $S_n$ is a pure power,

\[ p_n(u) = B(u)^n, \]

and the fact that the minimal span of the $T_j$ is 1 entails that $B(u)$ attains uniquely its maximum
at 1. Such cases have been known for a long time in probability theory. See Chapter 9 of [237].
END OF EXAMPLE IX.28.

FIGURE IX.12. The values of the function $B(u)$ for the Eulerian distribution
when $|u| = 1$, represented by a polar plot of $|B(e^{i\theta})|$ on the ray of angle $\theta$
(right). (The dashed contours represent the relevant parts of the unit circle, for
comparison.) The maximum is uniquely attained at $u = 1$, where $B(1) = 1$. This
entails a local limit law for the Eulerian distribution.


At this stage, it is worth pointing out an example not leading to a local law. Consider
the binomial distribution restricted to even values,

\[ p_{n,2k} = \frac{2}{2^n} \binom{n}{2k}, \qquad p_{n,2k+1} = 0. \]

The BGF is

\[ F(z,u) = \frac{1}{1 - z(1+u)/2} + \frac{1}{1 - z(1-u)/2}. \]

This has two poles,

\[ \rho_1(u) = \frac{2}{1+u}, \qquad \rho_2(u) = \frac{2}{1-u}, \]

and it is clearly not true that a single one dominates throughout the domain $|u| = 1$.
Accordingly, the PGF satisfies

\[ p_n(u) = 2^{-n} \left( (1+u)^n + (1-u)^n \right), \]

and no quasi-power law, with a unique analytic $B(u)$, holds uniformly for $u$ on the
unit circle. In essence, a local limit law will be likely to hold when a PGF has a sharp
peak near 1 and stays much smaller in modulus along the rest of the unit circle. In
contrast, for the even binomial distribution, one has $p_n(1) = p_n(-1)$.
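
The contrast between the two cases is visible numerically. The sketch below (function names are ours) evaluates both PGFs on the unit circle, using the normalization in which each PGF equals 1 at u = 1: for the plain binomial law the modulus collapses away from u = 1, while for the even binomial law it returns to 1 at u = −1.

```python
import cmath

def pgf_binomial(n, u):
    """PGF of the binomial distribution with parameter 1/2."""
    return ((1 + u) / 2) ** n

def pgf_even_binomial(n, u):
    """PGF of the binomial distribution restricted to even values."""
    return ((1 + u) ** n + (1 - u) ** n) / 2 ** n

def modulus_on_circle(pgf, n, theta):
    """|p_n(e^{i theta})| for the given PGF."""
    return abs(pgf(n, cmath.exp(1j * theta)))

if __name__ == "__main__":
    n = 20
    for theta in (0.0, cmath.pi / 2, cmath.pi):
        print(theta,
              modulus_on_circle(pgf_binomial, n, theta),
              modulus_on_circle(pgf_even_binomial, n, theta))
```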

EXAMPLE IX.29. Local law for the Eulerian distribution. For Eulerian numbers, we have
derived the approximate expression,

\[ p_n(u) = B(u)^{n+1} + O(2^{-n}), \]

when $u$ is close enough to 1, with

\[ B(u) = \rho(u)^{-1} = \frac{u - 1}{\log u}. \]

The plot of the function $B(u)$ when $u$ varies over $|u| = 1$ is then displayed in Fig. 12.

This case requires in fact a minor extension of Theorem IX.13 since the principal
determination of the logarithm cannot be extended to the whole of the unit circle, in particular at
$u = -1$. However, it is easily realized that the quasi-power expansion holds with the
possible exception of a small segment of the integration contour near $u = -1$. There,
the integrand is anyway exponentially smaller than on the rest of the contour, and the proof of
Theorem IX.13 is easily adjusted to cover such a case.

From this enhanced argument, there results that a local limit law of the Gaussian
type holds for the Eulerian distribution on any compact subset of the real line.
END OF EXAMPLE IX.29.
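
The unique-maximum property of B(u) = (u − 1)/log u on the unit circle can be checked directly: with the principal logarithm, log e^{iθ} = iθ for |θ| < π, so that |B(e^{iθ})| = |2 sin(θ/2)/θ|. The following short sketch (function names are ours) evaluates this modulus.

```python
import cmath
import math

def eulerian_B(u):
    """B(u) = (u - 1)/log u with the principal logarithm; B(1) = 1 by continuity."""
    if u == 1:
        return 1.0
    return (u - 1) / cmath.log(u)

def modulus(theta):
    """|B(e^{i theta})| for theta in (-pi, pi)."""
    return abs(eulerian_B(cmath.exp(1j * theta)))

if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(theta, modulus(theta))
```

The modulus equals 1 at θ = 0 and is strictly smaller elsewhere on the sampled range, which is the condition required by the theorem.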


With a similar care to be exercised regarding principal determinations and
dominant singularities, many of our earlier analyses can be turned into local limit laws.
What is needed is a dominant singularity $\rho(u)$ that yields the main asymptotic form of
the PGF’s on most of the unit circle and that achieves uniquely its minimum at 1, while
the rest of the unit circle contributes negligibly. For instance, this covers the surjection
distribution, for which

\[ \rho(u) = \log(1 + u^{-1}), \qquad B(u) = \frac{\log 2}{\log(1 + u^{-1})}, \]

leaves in general Catalan trees, where

\[ B(u) = \frac14 \left( 1 + u^{1/2} \right)^2, \]

or in binary Catalan trees.
The Stirling cycle distribution satisfies

\[ p_n(u) = \frac{n^{u-1}}{\Gamma(u)} \left( 1 + O\!\left( \frac{1}{n} \right) \right). \]

This approximation remains uniform as long as $u$ avoids $-1$, but, there, $p_n(u)$ is small
anyway (being $O(n^{-2})$), so that again an extended form of Theorem IX.13 applies
and a local limit law holds. The same argument applies to node levels in quadtrees of
Example IX.24.

▷ IX.37. Peaks of distributions. It is possible to analyse asymptotically in detail the values of
the peak of the Eulerian and Stirling cycle distributions. (For the Eulerian distribution, see, e.g.,
the study of Lesieur and Nicolas [330].) ◁

FIGURE IX.13. The quantities $\log p_{n,k}$ relative to the Eulerian numbers
illustrate an extremely fast decay of the distribution away from the mean. Here, the
diagrams corresponding to $n = 10, 20, 30, 40$ (top to bottom) are plotted. The
common shape of the curves indicates a large deviation property.


IX. 10. Large deviations 


Moment inequalities constrain the shape of a distribution given its mean and
variance. In particular, if $\sigma_n/\mu_n \to 0$, the concentration property holds. This property
comes from Chebyshev’s inequality according to which the probability of observing
a value that deviates by more than $x$ standard deviations from the mean is $O(x^{-2})$.
Such general bounds, though sufficient to establish a concentration property, are much
weaker than what holds under conditions of the quasi-power type, where the
probabilities of deviation are in fact exponentially decreasing in $x$.

Figure 13 displays the logarithms of the Eulerian distribution. As logarithms of
probabilities are plotted, the distribution is seen to decay very rapidly away from the
mean $\mu_n \sim n/2$. Consider for instance extreme cases. Clearly, there is a unique
permutation that has a minimal number of rises, namely the fully sorted permutation,
which has probability

$$\frac{1}{n!}.$$

In contrast, since $\mu_n \sim n/2$ and $\sigma_n^2 \sim n/12$, this extreme case is roughly at
$x = \sqrt{3n}$ standard deviations from the mean; thus, the Chebyshev inequality only provides
the very weak upper bound of $\frac{1}{3n}$ for this extreme case. For n = 40, the Chebyshev upper
bound on the probability is thus about 0.008 while the exact value 1/40! is of the order of $10^{-48}$.
Extensions of the quasi-power framework are once more well-suited to prove such
exponentially small tails, as we now explain. It turns out that the ubiquitous functions
$\rho(u)$, $B(u)$ are directly related to large deviation estimates. Such estimates nicely
supplement the already known limit laws, either central or local.
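The gap between the two estimates for the extreme case can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative, and the bound is taken as $1/x^2$ at $x = \sqrt{3n}$, as in the discussion above.

```python
from math import factorial, log10

def chebyshev_extreme_bound(n):
    """Chebyshev bound 1/x^2 taken at x = sqrt(3n) standard deviations, i.e. 1/(3n)."""
    return 1.0 / (3 * n)

def exact_extreme_log10(n):
    """Decimal logarithm of 1/n!, the probability of the single extreme permutation."""
    return -log10(factorial(n))

n = 40
print(chebyshev_extreme_bound(n))   # polynomially small bound
print(exact_extreme_log10(n))       # decimal exponent of the exact probability
```

The first quantity decays only like $1/n$, whereas the second has an exponent growing like $n \log n$, which is the discrepancy that large deviation estimates are meant to capture.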


Definition IX.5. A sequence of discrete random variables $\{X_n\}$ with $p_{n,k} = P\{X_n = k\}$
satisfies a local large deviation property of type $(\beta_n, W(x))$ over the interval
$[x_0, x_1]$ if, for any $x \in [x_0, x_1]$,

(58)    $$\frac{1}{\beta_n}\log p_{n,x\beta_n} \le W(x) + O(\beta_n^{-1}).$$

The function $W(x)$ is called a large deviation function and $\beta_n$ is the scaling factor.


The inequality (58) is a priori only meaningful if $x\beta_n$ is an integer, but it makes
sense as well if it is understood that $p_{n,w} = 0$ for nonintegral values of $w$ and $\log 0 = -\infty$.
Of course, the large deviation property is nontrivial only when $W(x) \le 0$, with
$W(x)$ not identically 0. A global (and marginally stronger) form of large deviations
can also be defined when local probabilities are replaced by corresponding values of
the cumulative distribution function. Large deviation theory is introduced nicely in
the book of den Hollander [119].


Theorem IX.14 (Quasi-powers, large deviations). Consider a sequence of discrete
random variables $\{X_n\}$ with PGF $p_n(u)$. Assume that there exist functions
$A(u)$, $B(u)$, analytic in some interval $[u_0, u_1]$ with $0 < u_0 < 1 < u_1$, such
that a quasi-power expansion holds,

(59)    $$p_n(u) = A(u)\,B(u)^{\beta_n}\left(1 + O(\kappa_n^{-1})\right),$$

uniformly. Then the $X_n$ satisfy a large deviation property,

(60)    $$\frac{1}{\beta_n}\log p_{n,x\beta_n} \le W(x) + O(\beta_n^{-1}),$$

where the large deviation function $W(x)$ is given by

(61)    $$W(x) = \min_{u\in[u_0,u_1]} \log\left(\frac{B(u)}{u^x}\right).$$


PROOF. The basic observation is that if $f(u) = \sum_k f_k u^k$ is an analytic function with
nonnegative coefficients, then, for positive $u$,

(62)    $$f_k = [u^k]f(u) \le \frac{f(u)}{u^k}, \qquad f_k \le \min_{u}\frac{f(u)}{u^k}.$$

The first inequality holds for any positive $u$ in the disc of analyticity of $f(u)$; the
second bound, with a similar condition, consists in taking the best possible value of $u$.
See our earlier discussion of saddle point bounds.
The combination of the principle (62) applied to $f(u) = p_n(u)$, and of the
assumption of the theorem (59) yields

$$\log p_{n,x\beta_n} \le \beta_n \min_{u\in[u_0,u_1]}\log\left(\frac{B(u)}{u^x}\right) + O(1).$$

Thus, a large deviation property holds with $W(x)$ given by (61).
In general, the function $W(x)$ is computable from $B(u)$ and its derivatives. The
minimum is attained at either an end-point or a point such that

$$\frac{d}{du}\left(\log B(u) - x\log u\right) = 0.$$

Let $\eta(x)$ be a value of $u \in [u_0, u_1]$ that cancels this derivative. Thus, $\eta$ is an inverse
function of $uB'(u)/B(u)$,

$$\frac{\eta(x)\,B'(\eta(x))}{B(\eta(x))} = x.$$

Then, a large deviation function is

(63)    $$W(x) = \log B(\eta(x)) - x\log\eta(x).$$

FIGURE IX.14. The large deviation function relative to the Eulerian distribution,
for $u \in [0.3, 0.7]$.
> IX.38. Prove similar types of bounds for the cumulative quantities

$$P_{n,k} = \sum_{j<k} p_{n,j}, \qquad Q_{n,k} = \sum_{j\ge k} p_{n,j}.$$
<


EXAMPLE IX.30. Large deviations for the Eulerian distribution. In this case, the BGF has
a unique dominant singularity for $u$ with $\epsilon < u < 1/\epsilon$, and any $\epsilon > 0$. Thus, there is a
quasi-power expansion with

$$B(u) = \frac{u-1}{\log u},$$

on any interval $[\epsilon, 1/\epsilon]$. Then $\eta(x)$ is computable as the inverse function of

$$\frac{u}{u-1} - \frac{1}{\log u}.$$

This function increases from 0 to 1 as $u$ increases from 0 to $+\infty$, so that the inverse function is
well defined over any closed interval $[\epsilon, 1-\epsilon]$. The function $W(x)$ is then determined by (63);
see Figure 14 for a plot of $W(x)$ that “explains” the data of Figure 13.

We find that

$$W(0.3) = W(0.7) = -0.252, \qquad W(0.4) = W(0.6) = -0.061,$$
$$W(0.45) = W(0.55) = -0.015,$$

and $W(0.5) = 0$, as expected. For instance, the probability of deviating by 20% from the mean
value $\mu_n \sim 0.5n$ is approximately $\exp(-0.061\,n)$. For n = 100, this upper bound is about
$e^{-6.07}$, while the exact value of the probability $p_{100,60}$ is smaller still. In the same vein, there
is probability less than $10^{-6}$ of deviating by 10% from the mean, when n = 1,000; the upper
bound becomes less than $10^{-65}$, for n = 10,000, less than $10^{-653}$, for n = 100,000. (These
are the estimates stated at the very beginning of this chapter.) END OF EXAMPLE IX.30.
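The values of the large deviation function quoted in this example can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it takes $B(u) = (u-1)/\log u$, solves $uB'(u)/B(u) = x$ by bisection in the variable $t = \log u$, and evaluates (63). For comparison it also computes exact Eulerian probabilities with the standard recurrence, indexing the Eulerian numbers by $k = 0, \ldots, n-1$ (an assumed convention; the text's indexing may differ by one). All function names are hypothetical.

```python
from math import expm1, log, factorial

def tilt(t):
    """u B'(u)/B(u) for B(u) = (u - 1)/log u, expressed with t = log u."""
    if abs(t) < 1e-6:
        return 0.5 + t / 12.0                  # Taylor expansion near u = 1
    return 1.0 + 1.0 / expm1(t) - 1.0 / t      # u/(u-1) - 1/log u

def large_deviation_W(x, lo=-60.0, hi=60.0, iterations=200):
    """W(x) = log B(eta) - x log eta, with log eta found by bisection (tilt is increasing)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tilt(mid) < x:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    log_B = 0.0 if t == 0.0 else log(expm1(t) / t)
    return log_B - x * t

def eulerian_row(n):
    """Eulerian numbers A(n, k), k = 0..n-1, via A(n,k) = (k+1)A(n-1,k) + (n-k)A(n-1,k-1)."""
    row = [1]
    for m in range(2, n + 1):
        prev, row = row, [0] * m
        for k in range(m):
            left = prev[k] if k < m - 1 else 0
            right = prev[k - 1] if k >= 1 else 0
            row[k] = (k + 1) * left + (m - k) * right
    return row

def log_probability(n, k):
    """Natural logarithm of the exact Eulerian probability A(n, k)/n!."""
    return log(eulerian_row(n)[k]) - log(factorial(n))

for x in (0.3, 0.4, 0.45, 0.5):
    print(x, large_deviation_W(x))
print(log_probability(100, 60), 100 * large_deviation_W(0.4))
```

Working in $t = \log u$ keeps the bisection well conditioned on both sides of $u = 1$, and the symmetry $W(x) = W(1-x)$ of the Eulerian case appears directly in the output.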


> IX.39. Quasi-Powers and large deviations. Under the Quasi-Powers assumption, it is usually
possible to convert the upper bound into an equality. This has been done by Hwang [272, 273,
274], who bases himself on a technique of Cramér. Roughly, by shifting the mean, the main
Quasi-Powers Theorem can be applied at some $u = u_0$ with $u_0 \ne 1$. <


IX.11. Non-Gaussian continuous limits


Previous sections of this chapter have developed two basic paradigms for bivariate 
asymptotics (see also Figure 3): 


— a “minor” singularity perturbation mode leading to discrete laws, 
— a “major” singularity perturbation mode leading to continuous laws. 


However, in both cases, the assumption has been made so far that the collection of
singular expansions parameterized by the auxiliary variable all belong to a common
analytic class and exhibit no sharp discontinuity when the secondary parameter traverses
the value $u = 1$. In this section we briefly explore by means of examples the way
discontinuities in singular behaviour induce non-Gaussian laws (Subsection IX.11.1), then
conclude with a fairly general discussion of the critical composition schema
(Subsection IX.11.2), thereby completing the classification of analytic composition schemas.
The discontinuities observed in the cases discussed here are reminiscent of what is
known as phase transition phenomena in statistical physics, and we find it suggestive
to borrow this terminology here.


IX.11.1. Phase transition diagrams. Perhaps the simplest case of discontinuity
in singular behaviour is the already discussed BGF,

$$F(z,u) = \frac{1}{(1-z)(1-zu)},$$

where $u$ records the number of a's in a random word of a*b*. The limit law is clearly
the continuous uniform distribution over the interval [0,1]. From the point of view of
the singular structure of $F(z,u)$, as a function of $z$, three distinct cases arise depending
on the values of $u$:

• $u < 1$: simple pole at $\rho(u) = 1$;
• $u = 1$: double pole at $\rho(1) = 1$;
• $u > 1$: simple pole at $\rho(u) = 1/u$.

Thus both the singularity location at $\rho(u)$ and the singular exponent $\alpha(u)$ experience
a nonanalytic transition at $u = 1$. This arises from a “confluence” of two singular
terms when $u = 1$.

To visualize such cases, it is useful to introduce a simplified diagram representation,
called a phase transition diagram and defined as follows. Write $Z = \rho(u) - z$
and reduce the singular expansion to its dominant singular term $Z^{\alpha(u)}$. Then, the
diagram representing $F(z,u)$ above is

$$\begin{array}{ccc}
u = 1-\epsilon & u = 1 & u = 1+\epsilon \\
\rho(u) = 1 & \rho(1) = 1 & \rho(u) = 1/u \\
Z^{-1} & Z^{-2} & Z^{-1}
\end{array}$$

FIGURE IX.15. Histograms of the distribution of the maximum of a random
walk for n = 10..60 (left) and the density of the arcsine law (right).


A complete classification of such confluences and discontinuities is still lack- 
ing (see however Marianne Durand’s thesis [142] for interesting fragments), and is 
perhaps beyond reach given the vast diversity of situations encountered in a combina- 
torialist’s practice. 


EXAMPLE IX.31. Arcsine law for unbiased random walks. This problem is studied in detail
by Feller [161, p. 94] who notes: “Contrary to intuition, the maximum accumulated gain is
much more likely to occur towards the very beginning or the very end of a coin-tossing game
than somewhere in the middle.” See Figure 15. In fact, if $X_n$ is the time of the first occurrence
of the maximum in a random game (walk with ±1 steps) of duration $n$, one has

$$P\{X_n < xn\} \sim \frac{2}{\pi}\arcsin\sqrt{x},$$

a distribution function with density

$$f(x) = \frac{1}{\pi\sqrt{x(1-x)}}.$$

The BGF results from the standard decomposition of positive walks. Roughly, there is a
sequence of steps ascending to the (nonnegative) maximum accompanied by “arches” (the left
factor) followed by an excursion below then back to the maximum, followed by a sequence of
descending steps with their companion arches. This translates directly into an equation satisfied




FIGURE IX.16. A plot of $1/F(z,u)$ for $z \in [0.4, 0.55]$ when $u$ is assigned
values between 4 and 2 (left). The exponent function $\alpha(u)$ and the singular value
$\rho(u)$ for $u \in [1/2, 3/2]$ (right).


by the BGF $F(z,u)$ of the location of the first maximum:

(64)    $$F(z,u) = \frac{1}{1 - zuD(zu)}\cdot D(z)\cdot\frac{1}{1 - zD(z)},$$

which involves the GF of gambler's ruin sequences (Example 6),

$$D(z) = \frac{1 - \sqrt{1-4z^2}}{2z^2}.$$

In such a simple case, explicit expressions are available from (64), as it suffices to expand first
with respect to $u$, then to $z$. We obtain in this way the ultra-classical result:

Proposition IX.15 (Arc sine law). Set $u_{2\nu} := 2^{-2\nu}\binom{2\nu}{\nu}$. The probability that the first
maximum in a random walk of length $n = 2\nu$ occurs at $k = 2\rho$ or $k = 2\rho+1$ is $\frac12 u_{2\rho}u_{2\nu-2\rho}$, for
$0 < k < 2\nu$. For any $x \in (0,1)$, the position $X_n$ of the first maximum satisfies

$$\lim_{n\to\infty} P_n(X_n < xn) = \frac{2}{\pi}\arcsin\sqrt{x}.$$

(The asymptotic form reflects by summation that of $u_{2\nu}$, since $u_{2\nu} \sim (\pi\nu)^{-1/2}$.)
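The exact part of the arc sine law can be checked by exhaustive enumeration for small lengths. The following is a minimal sketch with illustrative function names (they do not come from the text): it lists all $2^n$ walks with ±1 steps, records the time of the first maximum, and compares the resulting law with $\frac12 u_{2\rho}u_{2\nu-2\rho}$ for $0 < k < 2\nu$.

```python
from fractions import Fraction
from itertools import product
from math import comb

def u(two_nu):
    """u_{2 nu} = 2^{-2 nu} * binomial(2 nu, nu)."""
    nu = two_nu // 2
    return Fraction(comb(2 * nu, nu), 4 ** nu)

def first_maximum_distribution(n):
    """Exact law of the time of the first maximum over all 2^n walks with +-1 steps."""
    counts = [0] * (n + 1)
    for steps in product((1, -1), repeat=n):
        height, best, time_of_best = 0, 0, 0
        for time, step in enumerate(steps, start=1):
            height += step
            if height > best:      # strict inequality keeps the FIRST time the maximum is reached
                best, time_of_best = height, time
        counts[time_of_best] += 1
    return [Fraction(c, 2 ** n) for c in counts]

def arcsine_formula(n, k):
    """(1/2) u_{2 rho} u_{n - 2 rho} for k = 2 rho or 2 rho + 1, with n even and 0 < k < n."""
    rho = k // 2
    return Fraction(1, 2) * u(2 * rho) * u(n - 2 * rho)

law = first_maximum_distribution(8)
for k in range(1, 8):
    print(k, law[k], arcsine_formula(8, k))
```

Exact rational arithmetic is used so that agreement is an equality rather than a floating-point approximation; the enumeration is exponential in $n$ and is only meant for small cases.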
It is instructive to compare this to the way singularities evolve as $u$ crosses the value 1.
The dominant positive singularity is at $\rho(u) = 1/2$ if $u < 1$, while $\rho(u) = 1/(2u)$ if $u > 1$.
Local expansions show that, with $c_<(u)$, $c_>(u)$ two computable functions, there holds:

$$F(z,u) \sim c_<(u)\frac{1}{\sqrt{1-2z}} \quad (u < 1), \qquad F(z,u) \sim c_>(u)\frac{1}{\sqrt{1-2zu}} \quad (u > 1).$$

Naturally, at $u = 1$, all words are counted and

$$F(z,1) = \frac{1}{1-2z}.$$

Thus, the corresponding phase transition diagram is (see Figure 16):

$$\begin{array}{ccc}
u = 1-\epsilon & u = 1 & u = 1+\epsilon \\
\rho(u) = \tfrac12 & \rho(1) = \tfrac12 & \rho(u) = \tfrac{1}{2u} \\
Z^{-1/2} & Z^{-1} & Z^{-1/2}
\end{array}$$

(Negative singularities have a smaller weight and may be discarded.)
END OF EXAMPLE IX.31.


In this particular case, elementary combinatorics yields the arcsine distribution 
without the need of a recourse to singularities. The point to be made here is that 
the arcsine law could be expected when a similar phase transition diagram occurs. 
There is of course universality in this singular view of the arcsine law, which can be 
extended to walks with zero drift (Chapter VII). This kind of universality is a parallel 
to the universality of Brownian motion, which is otherwise familiar to probabilists. 


> IX.40. Number of maxima and other stories. The construction underlying (64) also serves
to analyse: (i) the number of times the maximum is attained; (ii) the difference between the
maximum and the final altitude of the walk; (iii) the duration of the period following the last
occurrence of the maximum. <


EXAMPLE IX.32. Path length in trees. A final example is the distribution of path length
in trees, which has been studied by Louchard, Takács and others [339, 340, 461, 462]. The
distribution is known not to be Gaussian, as results from computation of the first few moments.
In the case of general Catalan trees, the analysis reduces to that of the functional equation

$$F(z,u) = \frac{1}{1 - zF(zu,u)}.$$

This defines $F(z,u)$ as a formal continued fraction, which suggests setting (cf. Chapters III
and V as well as our discussion of coin fountains and polyomino models)

$$F(z,u) = \frac{A(z)}{B(z)},$$

the variable $u$ being viewed as a parameter. From the basic functional equation, there results

$$A(z) = B(zu), \qquad B(z) = B(zu) - zB(zu^2).$$

The functional equation for $B$ may now be solved by indeterminate coefficients:

$$B(z) = \sum_{n\ge0}\frac{(-1)^n u^{n(n-1)} z^n}{(1-u)(1-u^2)\cdots(1-u^n)}.$$

Because of the quadratic exponents involved, the functions $B(z)$ and $F(z,u)$ have radius of
convergence 0 when $u > 1$, and are thus nonanalytic. In contrast, when $u < 1$, then $B(z,u)$ is
an entire function of $z$, so that $F(z,u)$ is meromorphic in $z$. Hence the singularity diagram:

$$\begin{array}{ccc}
u = 1-\epsilon & u = 1 & u = 1+\epsilon \\
\rho(u) > \tfrac14 & \rho(1) = \tfrac14 & \rho(u) = 0 \\
Z^{-1} & Z^{1/2} & \text{---}
\end{array}$$

The limit law is the Airy area distribution, which is related to the Airy function [340, 339, 461,
462]. By an analytical tour de force, Prellberg [401] has developed a method based on integral
representations and coalescing saddle points (Chapter VIII) that permits us to extract the phase
transition diagram above, together with precise uniform asymptotic expansions. As similar
problems occur in relation to connectivity of random graphs [205], future years should see
more applications of Prellberg's method. END OF EXAMPLE IX.32.
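The solution by indeterminate coefficients in this example can be checked numerically for $0 < u < 1$, where the series for $B$ converges very fast. The sketch below is illustrative only (function names and the truncation at 60 terms are assumptions): it builds the coefficients from the recurrence $b_{n+1} = -u^{2n} b_n/(1-u^{n+1})$, which follows from $B(z) = B(zu) - zB(zu^2)$, and tests both functional equations at a sample point.

```python
def B(z, u, terms=60):
    """Truncated series sum_{n>=0} (-1)^n u^{n(n-1)} z^n / ((1-u)(1-u^2)...(1-u^n)), 0 < u < 1."""
    total, coefficient = 0.0, 1.0
    for n in range(terms):
        total += coefficient * z ** n
        # Extracting [z^{n+1}] in B(z) = B(zu) - z B(zu^2) gives this ratio of coefficients.
        coefficient *= -(u ** (2 * n)) / (1.0 - u ** (n + 1))
    return total

def F(z, u):
    """F(z, u) = A(z)/B(z) with A(z) = B(zu)."""
    return B(z * u, u) / B(z, u)

z, u = 0.2, 0.5
print(B(z, u), B(z * u, u) - z * B(z * u * u, u))   # two sides of the equation for B
print(F(z, u), 1.0 / (1.0 - z * F(z * u, u)))       # two sides of the equation for F
```

For $u > 1$ the same coefficients grow like $u^{n^2/2}$, so the series diverges for every $z \ne 0$, in line with the radius of convergence 0 noted in the text.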


IX.11.2. Semi-large powers, critical compositions, and stable laws. We conclude
this section by a discussion of critical compositions that typically involve confluences
of singularities and lead to a general class of continuous distributions closely
related to stable laws of probability theory. We start with an example where everything
is explicit, that of zero contacts in random bridges, then state a general theorem
on “semi-large” powers of functions of singularity analysis type, and finally discuss
combinatorial applications.


EXAMPLE IX.33. Zero-contacts in bridges. Consider once more fluctuations in coin tossings,
and specifically bridges, corresponding to a conditioning of the game by the fact that the final
gain is 0 (negative capitals are allowed). These are sequences of arbitrary positive or negative
“arches”, and the number of arches in a bridge is exactly equal to the number of intermediate
steps at which the capital is 0. From the arch decomposition, there results that the ordinary BGF
of bridges with $z$ marking length and $u$ marking zero-contacts is

$$B(z,u) = \frac{1}{1 - 2uz^2D(z)}.$$

Analysing this function is conveniently done by introducing

$$F(z,u) = B\left(\tfrac12\sqrt{z},\,u\right) = \frac{1}{1 - u\left(1 - \sqrt{1-z}\right)}.$$

The phase transition diagram is then easily found to be:

$$\begin{array}{ccc}
u = 1-\epsilon & u = 1 & u = 1+\epsilon \\
\rho(u) = 1 & \rho(1) = 1 & \rho(u) = 1 - (1-u^{-1})^2 \\
Z^{1/2} & Z^{-1/2} & Z^{-1}
\end{array}$$

Thus, there are discontinuities, both in the location of the singularity and the exponent. But
these are of a type different from what gave rise to the arcsine law of random walks.

The problem of the limit law is here easily solved since explicit expressions are provided
by the Lagrange Inversion Theorem. One finds:

$$[u^k][z^n]F(z,u) = [z^n]\left(1-\sqrt{1-z}\right)^k = \frac{k}{n}[w^{n-k}](2-w)^{-n} = \frac{k}{n}\,2^{k-2n}\binom{2n-k-1}{n-1}.$$

Then Stirling's formula provides:
Proposition IX.16. The number $X_n$ of zero-contacts of a random bridge of size $2n$ satisfies,
as $n \to \infty$, the local limit law

$$P(X_n = x\sqrt{n}) \sim \frac{x}{2\sqrt{n}}\,e^{-x^2/4},$$

for $x$ in any compact set of $(0, +\infty)$.

FIGURE IX.17. The G-functions for $\lambda = 0.1..0.8$ (left; from bottom to top)
and for $\lambda = 1.2..1.9$ (right; from top to bottom); the thicker curves represent the
Rayleigh law (left, $\lambda = \frac12$) and the Airy law (right, $\lambda = \frac32$).


A random variable with density and distribution function given by

(65)    $$r(x) = \frac{x}{2}\,e^{-x^2/4}, \qquad R(x) = 1 - e^{-x^2/4},$$

is called a Rayleigh law. Thus the number of zero contacts obeys a Rayleigh law in the
asymptotic limit. END OF EXAMPLE IX.33.
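The local limit law for zero-contacts can be compared with the exact probabilities obtained from the Lagrange inversion formula of this example. The following is a minimal sketch with hypothetical function names; it divides the coefficient $\frac{k}{n}2^{k-2n}\binom{2n-k-1}{n-1}$ by $[z^n](1-z)^{-1/2} = 4^{-n}\binom{2n}{n}$ and rescales by $\sqrt{n}$.

```python
from fractions import Fraction
from math import comb, exp, sqrt

def zero_contact_probability(n, k):
    """Exact P(X_n = k) for bridges of length 2n, from the Lagrange inversion formula."""
    weight = Fraction(k * comb(2 * n - k - 1, n - 1), n * 2 ** (2 * n - k))
    all_bridges = Fraction(comb(2 * n, n), 4 ** n)     # [z^n] (1 - z)^(-1/2)
    return weight / all_bridges

def rayleigh_density(x):
    """r(x) = (x/2) exp(-x^2/4), as in (65)."""
    return 0.5 * x * exp(-x * x / 4.0)

n, k = 2500, 100                                       # x = k / sqrt(n) = 2
x = k / sqrt(n)
print(sqrt(n) * float(zero_contact_probability(n, k)), rayleigh_density(x))
```

The two printed numbers should be close, with a relative discrepancy that shrinks as $n$ grows; exact rationals are used for the probabilities so that the only approximation is the limit law itself.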


> IX.41. Cyclic points in mappings. The number of cyclic points in mappings has exponential
BGF $(1 - uT(z))^{-1}$, with $T$ the Cayley tree function. The singularity diagram is of the same
form as in Example 33. Explicit forms are available by Lagrange inversion, and the limit law is
again Rayleigh. (Note: This has been vastly generalized by Drmota and Soria [137, 138].) <


Both Example 33 and Note 41 exemplify the situation of an analytic composition
scheme of the form $(1 - uf(z))^{-1}$ which is critical, since in each case $f$ assumes
value 1 at its singularity. Both can be treated elementarily since they involve powers
that are amenable to Lagrange inversion, eventually resulting in a Rayleigh law. As
we now explain, there is a family of functions that appear to play a universal rôle
in problems sharing such singular types. What follows is taken from an article by
Banderier et al. [22].

We first introduce a function $G$ that otherwise naturally surfaces in the study of
stable² distributions in probability theory. For any parameter $\lambda \in (0, 2)$, define the
entire function

(66)    $$G(x;\lambda) := \begin{cases}
\displaystyle\frac{1}{\pi}\sum_{k\ge1}(-1)^{k-1}x^k\,\frac{\Gamma(1+\lambda k)}{\Gamma(1+k)}\,\sin(\pi k\lambda) & (0 < \lambda < 1) \\[2ex]
\displaystyle\frac{1}{\pi x}\sum_{k\ge1}(-1)^{k-1}x^k\,\frac{\Gamma(1+k/\lambda)}{\Gamma(1+k)}\,\sin(\pi k/\lambda) & (1 < \lambda < 2)
\end{cases}$$

The function $G(x;\frac12)$ is a normalized variant of the Rayleigh distribution (65). The
function $G(x;\frac32)$ constitutes the density of the “Airy map” distribution found in random
maps as well as in other coalescence phenomena and discussed in detail below,
see (73).

²In probability theory, stable laws are defined as the possible limit laws of sums of independent
identically distributed random variables. The function $G$ is a trivial variant of the density of the stable law of
index $\lambda$; see Feller's book [162, p. 581-583]. Valuable information regarding stable laws may be found in
the books by Breiman [72, Sec. 9.8], Durrett [143, Sec. 2.7], and Zolotarev [516].
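The series (66) can be summed directly for moderate values of $x$. The sketch below is an illustration, with the function name and the truncation order chosen here rather than taken from the text. For $\lambda = \frac12$ only odd $k$ contribute, and the series sums to $\frac{x}{2\sqrt{\pi}}e^{-x^2/4}$, which is the Rayleigh density (65) divided by $\sqrt{\pi}$; the code compares the two numerically.

```python
from math import gamma, factorial, sin, pi, exp, sqrt

def G(x, lam, terms=80):
    """Truncated series (66); intended for moderate |x| and, when 1 < lam < 2, for x != 0."""
    if 0 < lam < 1:
        total = sum((-1) ** (k - 1) * x ** k * gamma(1 + lam * k) / factorial(k)
                    * sin(pi * k * lam) for k in range(1, terms))
        return total / pi
    if 1 < lam < 2:
        total = sum((-1) ** (k - 1) * x ** k * gamma(1 + k / lam) / factorial(k)
                    * sin(pi * k / lam) for k in range(1, terms))
        return total / (pi * x)
    raise ValueError("lam must lie in (0, 1) or (1, 2)")

def normalized_rayleigh(x):
    """x / (2 sqrt(pi)) * exp(-x^2/4): the density r(x) of (65) divided by sqrt(pi)."""
    return x / (2.0 * sqrt(pi)) * exp(-x * x / 4.0)

for x in (0.5, 1.0, 3.0):
    print(x, G(x, 0.5), normalized_rayleigh(x))
print(G(1.0, 1.5))     # a value of the map-Airy type function
```

Direct summation is adequate here because the terms decay factorially; for large $|x|$ the alternating terms cancel heavily and a different evaluation method would be needed.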


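Theorem IX.15 (Semi-large powers). The coefficient of $z^n$ in a power $H(z)^k$ of a
Δ-continuable function $H(z)$ with singular exponent $\lambda$ admits the following asymptotic
estimates.

(i) For $0 < \lambda < 1$, that is, $H(z) = \sigma - h_\lambda(1 - z/\rho)^\lambda + O(1 - z/\rho)$, and when
$k = xn^\lambda$, with $x$ in any compact subinterval of $(0, +\infty)$, there holds

(67)    $$[z^n]H^k(z) \sim \sigma^k\rho^{-n}\,\frac{1}{n}\,G\left(\frac{xh_\lambda}{\sigma};\lambda\right).$$

(ii) For $1 < \lambda < 2$, that is, $H(z) = \sigma - h_1(1 - z/\rho) + h_\lambda(1 - z/\rho)^\lambda + O((1 - z/\rho)^2)$,
when $k = \frac{\sigma}{h_1}n + xn^{1/\lambda}$, with $x$ in any compact subinterval of $(-\infty, +\infty)$,
there holds

(68)    $$[z^n]H^k(z) \sim \sigma^k\rho^{-n}\,\frac{1}{n^{1/\lambda}}\left(\frac{h_1}{h_\lambda}\right)^{1/\lambda} G\left(\frac{x\,h_1^{1+1/\lambda}}{\sigma\,h_\lambda^{1/\lambda}};\lambda\right).$$

(iii) For $\lambda > 2$, a Gaussian approximation holds. In particular, for $2 < \lambda < 3$,
that is, $H(z) = \sigma - h_1(1 - z/\rho) + h_2(1 - z/\rho)^2 - h_\lambda(1 - z/\rho)^\lambda + O((1 - z/\rho)^3)$,
when $k = \frac{\sigma}{h_1}n + x\sqrt{n}$, with $x$ in any compact subinterval of $(-\infty, +\infty)$, there holds

$$[z^n]H^k(z) \sim \sigma^k\rho^{-n}\,\frac{1}{\sqrt{n}}\,\frac{\sigma/h_1}{a\sqrt{2\pi}}\,e^{-x^2/(2a^2)}, \qquad\text{with } a^2 = 2\left(\frac{h_2}{h_1} - \frac{h_1}{2\sigma}\right)\frac{\sigma^2}{h_1^2}.$$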

The term “semi-large” refers to the fact that the exponents $k$ in case (i) are of the
form $O(n^\theta)$ for some $\theta < 1$ chosen in accordance with the region where an “interesting”
renormalization takes place and dependent on each particular singular exponent.
When the interesting region reaches the $O(n)$ range in case (iii), the analysis of large
powers, as detailed in Chapter VIII, starts to apply and Gaussian forms result.
PROOF. The proofs are somewhat similar to the basic ones in singularity analysis, but
they require a suitable adjustment of the geometry of the Hankel contour and of the
corresponding scaling.

Case (i). A classical Hankel contour, with the change of variable $z = \rho(1 - t/n)$,
yields the approximation

(69)    $$[z^n]H^k(z) \sim -\frac{\sigma^k\rho^{-n}}{2i\pi n}\int e^{\,t - \frac{xh_\lambda}{\sigma}t^\lambda}\,dt.$$

The integral is then simply estimated by expanding $\exp\left(-\frac{xh_\lambda}{\sigma}t^\lambda\right)$ and integrating
termwise,

(70)    $$[z^n]H^k(z) \sim \frac{\sigma^k\rho^{-n}}{n}\sum_{j\ge1}\frac{(-x)^j}{j!}\left(\frac{h_\lambda}{\sigma}\right)^j\frac{1}{\Gamma(-\lambda j)},$$

which is equivalent to Equation (67), by virtue of the complement formula for the
Gamma function.

Case (ii). When $1 < \lambda < 2$, the contour of integration in the $z$-plane is chosen
to be a positively oriented loop, made of two rays of angle $\pi/(2\lambda)$ and $-\pi/(2\lambda)$ that
intersect on the real axis at a distance $1/n^{1/\lambda}$ left of the singularity. The coefficient
integral of $H^k$ is rescaled by setting $z = \rho(1 - t/n^{1/\lambda})$, and one has

$$[z^n]H^k(z) \sim -\frac{\sigma^k\rho^{-n}}{2i\pi n^{1/\lambda}}\int e^{\frac{h_\lambda}{h_1}t^\lambda}\,e^{-\frac{xh_1}{\sigma}t}\,dt.$$

There, the contour of integration in the $t$-plane comprises two rays of angle $\pi/\lambda$ and
$-\pi/\lambda$, intersecting at $-1$. Setting $u = t^\lambda h_\lambda/h_1$, the contour transforms into a classical
Hankel contour, starting from $-\infty$ over the real axis, winding about the origin,
and returning to $-\infty$. So, with $\alpha = 1/\lambda$, one has

$$[z^n]H^k(z) \sim -\frac{\sigma^k\rho^{-n}}{2i\pi n^{\alpha}}\,\alpha\left(\frac{h_1}{h_\lambda}\right)^{\alpha}\int e^{u}\,\exp\left(-\frac{x\,h_1^{1+\alpha}}{\sigma\,h_\lambda^{\alpha}}\,u^{\alpha}\right)u^{\alpha-1}\,du.$$

Expanding the exponential, integrating termwise, and appealing to the complement
formula for the Gamma function finally reduces this last form to (68).

Case (iii). This case is only included here for comparison purposes, but, as
recalled before the proof, it is essentially implied by the developments of Chapter VIII
based on the saddle point method. When $2 < \lambda < 3$, the angle $\phi$ of the contour of
integration in the $z$-plane is chosen to be $\pi/2$, and the scaling is $\sqrt{n}$: under the change
of variable $z = \rho(1 - t/\sqrt{n})$, the contour is transformed into two rays of angle $\pi/2$
and $-\pi/2$ (i.e., a vertical line), intersecting at $-1$, and

$$[z^n]H^k(z) \sim -\frac{\sigma^k\rho^{-n}}{2i\pi\sqrt{n}}\int e^{\,pt^2 - \frac{xh_1}{\sigma}t}\,dt,$$

with $p = \frac{h_2}{h_1} - \frac{h_1}{2\sigma}$. Completing the square, and letting $u = t - \frac{xh_1}{2p\sigma}$, we get

$$[z^n]H^k(z) \sim -\frac{\sigma^k\rho^{-n}}{2i\pi\sqrt{n}}\,e^{-\frac{x^2h_1^2}{4p\sigma^2}}\int e^{pu^2}\,du,$$

which gives the Gaussian estimate of case (iii). By similar means, such a Gaussian approximation can be
shown to hold for any non-integral singular exponent $\lambda > 2$.
> IX.42. Zipf laws. Zipf's law, named after the Harvard linguistics professor George Kingsley
Zipf (1902-1950), is the observation that, in a language like English, the frequency with which
a word occurs is roughly inversely proportional to its rank: the $k$th most frequent word has
frequency proportional to $1/k$. The generalized Zipf distribution of parameter $\alpha > 1$ is the law
of a variable $Z$ such that

$$P(Z = k) = \frac{1}{\zeta(\alpha)}\,\frac{1}{k^\alpha}.$$

It has infinite mean for $\alpha \le 2$ and infinite variance for $\alpha \le 3$. It was proved in Chapter VI that
polylogarithms are amenable to singularity analysis. Consequently, the sum of a large number
of independent Zipf variables satisfies a local limit law of the stable type with index $\alpha - 1$
($\alpha \ne 2$). <


EXAMPLE IX.34. Mean level profiles of trees. Consider the depth of a random node in
a random tree taken from a simple variety $\mathcal{Y}$ that satisfies the usual analytic assumptions of
Chapter VII. The problem of quantifying this distribution is equivalent to that of determining
the mean level profile, that is the sequence of numbers $M_{n,k}$ representing the mean number of
nodes at distance $k$ from the root. (The probability that a random node lies at level $k$ is then
$M_{n,k}/n$.) The first few levels have been characterized in Chapter VII, and the analysis of that
chapter can now be completed thanks to Theorem IX.15. The problem was solved by Meir
and Moon [356] in an important article that launched the analytic study of simple varieties of
trees. As usual, we let $\phi(w)$ be the generator of the simple variety $\mathcal{Y}$, with $Y(z)$ satisfying
$Y = z\phi(Y)$, and we designate by $\tau$ the positive root of the characteristic equation:

$$\tau\phi'(\tau) - \phi(\tau) = 0.$$

It is known from Chapter VII that the GF $Y(z)$ has a square root singularity at $\rho = \tau/\phi(\tau)$.
We also assume aperiodicity of $\phi$. Then Meir and Moon's major result (Theorem 4.3 of [356])
is as follows.

Proposition IX.17 (Mean level profiles). The mean profile of a large tree in a simple variety
obeys a Rayleigh law in the asymptotic limit: for $k/\sqrt{n}$ in any bounded interval of $\mathbb{R}_{\ge0}$, the
mean number of nodes at altitude $k$ satisfies asymptotically

$$M_{n,k} \sim A\,k\,e^{-Ak^2/(2n)},$$

where $A = \tau^2\phi''(\tau)/\phi(\tau)$.

(Note: Meir and Moon base their analysis on a Lagrangean change of variable and on the
saddle point method.)
PROOF. For each $k$, define $Y_k(z,u)$ to be the BGF with $u$ marking the number of nodes at
depth $k$. Then, the root decomposition of trees translates into the recurrence:

$$Y_k(z,u) = z\phi(Y_{k-1}(z,u)), \qquad Y_0(z,u) = zu\phi(Y(z)) = uY(z).$$

By construction, we have

$$M_{n,k} = \frac{1}{Y_n}\,[z^n]\left(\frac{\partial}{\partial u}Y_k(z,u)\right)_{u=1}, \qquad Y_n = [z^n]Y(z).$$

On the other hand, the fundamental recurrence yields

$$\left(\frac{\partial}{\partial u}Y_k(z,u)\right)_{u=1} = \left(z\phi'(Y(z))\right)^k\,Y(z).$$

Now, $\phi'(Y)$ has, like $Y$, a square root singularity. The semi-large powers theorem applies
with $\lambda = \frac12$, and the result follows. The same method gives access to the
variance of the number of nodes at any depth $k$. The variance of the altitude of a random node
is also easily computed [356]. END OF EXAMPLE IX.34.


> IX.43. The number of cyclic points in mappings. In the basic case of random mappings, we
are dealing with $F(z,u) = (1 - uT(z))^{-1}$, and a Rayleigh law holds. This extends to the
number of cyclic points in a simple variety of mappings (e.g., mappings defined by a finite
constraint on degrees). <
> IX.44. The width of trees. The expectation of the width $W$ of a tree in a simple variety
satisfies

$$C_1\sqrt{n} \le E_n(W) \le C_2\sqrt{n\log n},$$

for some $C_1, C_2 > 0$. This is due to Odlyzko and Wilf [379] in 1987. (Better bounds are
now known, since $W_n/\sqrt{n}$ has been later recognized to be related to Brownian excursion. In
particular, the expected width is $\sim c\sqrt{n}$.) <


The results of Theorem IX.15 provide in addition useful information on composition
schemas of the form

$$M(z,u) = C(uH(z)),$$

provided $C$ and $H$ are algebraic-logarithmic in the sense above. Combinatorially,
this represents a substitution between structures, $\mathcal{M} = \mathcal{C}\circ\mathcal{H}$, and the coefficient
$[z^nu^k]M(z,u)$ counts the number of $\mathcal{M}$-structures of size $n$ whose $\mathcal{C}$-component,
also called core in what follows, has size $k$. Then the probability distribution of core-size
$X_n$ in $\mathcal{M}$-structures of size $n$ is given by

$$P(X_n = k) = \frac{[z^k]C(z)}{[z^n]C(H(z))}\,[z^n]H(z)^k.$$

The case where the schema is critical, in the sense that $H(r_H) = r_C$ with $r_H$, $r_C$
the radii of convergence of $H$, $C$, follows as a direct consequence of Theorem IX.15.
What comes out is the following informally stated general principle (details would
closely mimic the statement of Theorem IX.15 and are omitted).


Proposition IX.18 (Critical compositions). In a composition schema $C(uH(z))$
where $H$ and $C$ have singular exponents $\lambda$, $\lambda'$ with $\lambda' \le \lambda$:

(i) for $0 < \lambda < 1$, the normalized core-size $X_n/n^\lambda$ is spread over $(0, +\infty)$
and it satisfies a local limit law whose density involves the stable law of index $\lambda$; in
particular, $\lambda = \frac12$ corresponds to a Rayleigh law;

(ii) for $1 < \lambda < 2$, the distribution of $X_n$ is bimodal and the “large region”
$X_n = cn + xn^{1/\lambda}$ leads to a stable law of index $\lambda$;

(iii) for $2 < \lambda$, the standardized version of $X_n$ admits a local limit law that is of
Gaussian type.


Similar phenomena occur when $\lambda' > \lambda$, but with a greater preponderance of
the “small” region. Many instances have already appeared scattered in the literature,
especially in connection with rooted trees. For instance, this proposition explains well
the occurrence of the Rayleigh law ($\lambda = \frac12$) as the distribution of cyclic points in
random mappings and of zero-contacts in random bridges. The case $\lambda = 3/2$ appears
in forests of unrooted trees (see the discussion in Chapter VIII for a complementary
approach based on coalescing saddle points) and it is ubiquitous in planar maps, as
attested by the article of Banderier et al. on which this subsection is largely based [22].
We detail one of the cases in the following example, which explains the meaning of
the term “large region” in Proposition IX.18.


EXAMPLE IX.35. Biconnected cores of planar maps. The OGF of rooted planar maps, with
size determined by the number of edges, is by Chapter VII,

(71)    $$M(z) = -\frac{1}{54z^2}\left(1 - 18z - (1-12z)^{3/2}\right),$$

with a characteristic $\frac32$ exponent. Define a separating vertex or articulation point in a map to
be a vertex whose removal disconnects the graph. Let $\mathcal{C}$ denote the class of nonseparable maps,
that is, maps without an articulation point (also known as biconnected maps). Starting from
the root edge, any map decomposes into a nonseparable map, called the “core”, on which are
grafted arbitrary maps.

There results the equation:

(72)    $$M(z) = C(H(z)), \qquad H(z) = z(1 + M(z))^2.$$

This gives in passing the OGF of nonseparable maps as the algebraic function of degree 3
specified implicitly by the equation

$$C^3 + 2C^2 + (1 - 18z)C + 27z^2 - 2z = 0,$$

with expansion at the origin (EIS A000139):

$$C(z) = 2z + z^2 + 2z^3 + 6z^4 + 22z^5 + 91z^6 + \cdots, \qquad C_{k+1} = 2\,\frac{(3k)!}{(k+1)!\,(2k+1)!}.$$

(The closed form results from a Lagrangean parameterization.) Now the singularity of $C$ is also
of the $Z^{3/2}$ type as seen by inversion of (72) or from the Newton diagram attached to the cubic
equation. We find in particular

$$C(z) = \frac13 - \frac49\left(1 - \frac{27z}{4}\right) + \frac{8\sqrt{3}}{81}\left(1 - \frac{27z}{4}\right)^{3/2} + O\left(\left(1 - \frac{27z}{4}\right)^2\right),$$

which is reflected by the asymptotic estimate,

$$C_k \sim \frac{2}{27}\sqrt{\frac{3}{\pi}}\left(\frac{27}{4}\right)^k k^{-5/2}.$$

FIGURE IX.18. Left: The standard Airy distribution. Right: Observed frequencies
of core-sizes $k \in [20; 1000]$ in 50,000 random maps of size 2,000, showing
the bimodal character of the distribution.
The parameter considered here is the distribution of the size $X_n$ of the core (containing
the root) in a random map of size $n$. The composition relation is $\mathcal{M} = \mathcal{C}\circ\mathcal{H}$, where
$\mathcal{H} = \mathcal{Z}(1+\mathcal{M})^2$. The BGF is thus $M(z,u) = C(uH(z))$ where the composition $C\circ H$ is of the
singular type $Z^{3/2}\circ Z^{3/2}$. What is peculiar here is the “bimodal” character of the distribution
of core-size (see Figure 18 borrowed from [22]), which we now detail.

First, straight singularity analysis shows that, for fixed $k$, the probability

$$P(X_n = k) = C_k\,\frac{[z^n]H(z)^k}{[z^n]M(z)}$$

tends, as $n \to \infty$, to a limit that depends only on $k$ and on $h_0 = \frac{4}{27}$, the value of $H(z)$ at
its singularity. In other words, there is local convergence
of the probabilities to a fixed discrete law. The estimate above can be proved to remain uniform
as long as $k$ tends to infinity sufficiently slowly. We shall call this the “small range” of $k$ values.
Now, summing the probabilities associated to this small range gives the value $C(h_0) = \frac13$.
Thus, one-third of the probability mass of core-size arises from the small range, where a discrete
limit law is observed.

The other part of the distribution constitutes the “large range” to which Theorem IX.15
applies. This contains asymptotically $\frac23$ of the probability mass of the distribution of $X_n$. In
that case, the limit law is given by a $G(x;\frac32)$ law, also known as “map Airy” law and one finds
for $k = \frac{n}{3} + xn^{2/3}$, the continuous local limit:


(73) P(Xn) ~ 5A(G2"2), A(x) = pared (x Ai(x”) — Ai/(x7)) . 


There $\mathrm{Ai}(x)$ is the Airy function, and $\mathcal{A}(x)$ defines the map Airy distribution displayed in
Figure 18, a variant of the stable law of index $\frac32$. END OF EXAMPLE IX.35.
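Several exact facts about the OGF of nonseparable maps used in this example can be checked with integer arithmetic. The sketch below is illustrative, with function names chosen here: it generates the counts 2, 1, 2, 6, 22, 91, ... from the ratio of consecutive coefficients implied by the closed form, verifies that the truncated series satisfies the cubic equation, and sums $C_k (4/27)^k$ to confirm the value $\frac13$ at the singularity.

```python
def nonseparable_counts(n_max):
    """[0, C_1, ..., C_{n_max}] with C_1 = 2 and
    C_{k+1}/C_k = 3k(3k-1)(3k-2) / ((k+1)(2k+1)(2k)), as implied by the closed form."""
    counts = [0, 2]
    for k in range(1, n_max):
        counts.append(counts[k] * 3 * k * (3 * k - 1) * (3 * k - 2)
                      // ((k + 1) * (2 * k + 1) * (2 * k)))
    return counts

def multiply(a, b, order):
    """Product of two power series (coefficient lists), truncated after z^order."""
    result = [0] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            result[i + j] += ai * bj
    return result

def cubic_residual(order):
    """Coefficients of C^3 + 2C^2 + (1 - 18z)C + 27z^2 - 2z up to z^order (order >= 2)."""
    c = nonseparable_counts(order)
    c2 = multiply(c, c, order)
    c3 = multiply(c2, c, order)
    residual = [c3[i] + 2 * c2[i] + c[i] for i in range(order + 1)]
    for i in range(1, order + 1):
        residual[i] -= 18 * c[i - 1]
    residual[1] -= 2
    residual[2] += 27
    return residual

def value_at_singularity(terms=200000):
    """Partial sum of C_k (4/27)^k, accumulated in floating point via the coefficient ratio."""
    term, total = 2.0 * 4.0 / 27.0, 0.0
    for k in range(1, terms + 1):
        total += term
        term *= 3 * k * (3 * k - 1) * (3 * k - 2) / ((k + 1) * (2 * k + 1) * (2 * k)) * 4.0 / 27.0
    return total

print(nonseparable_counts(6)[1:])
print(cubic_residual(12))
print(value_at_singularity())
```

The sum at the singularity converges slowly, since its terms decay only like $k^{-5/2}$, which is itself a reflection of the $Z^{3/2}$ singular exponent.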


The bimodal character of the law can now be better understood following [22].
A random map decomposes completely into biconnected components and the largest
biconnected component has, with high probability, a size that is O(n). There are also a
large number (O(n)) of “dangling” biconnected components. In a rooted map, the root is
in a sense placed “at random”. Then, with a fixed probability, it either lies in the large
component (in which case the distribution of that large component is observed; this is
the continuous part of the distribution given by the Airy map law), or else one of the
small components is picked up by the root (this is the discrete part of the distribution).


> IX.45. Critical cycles. The theory adapts to logarithmic factors. For instance the critical
composition $F(z,u) = -\log(1 - ug(z))$ leads to developments similar to those of the critical
sequence. In this way, it becomes possible for instance to analyse the number of cyclic points
in a random connected mapping. <

> IX.46. The base of supertrees. Supertrees defined in Chapter VI are trees rooted on trees.
Here we consider the bicoloured variant $\mathcal{K} = \mathcal{G}(2\mathcal{Z}\mathcal{G})$, with $\mathcal{G}$ the class of general Catalan
trees. Then, the law of the external $\mathcal{G}$-component is related to a stable law of index x. <


IX.12. Multivariate limit laws


There exist natural extensions of continuity theorems, both for PGFs and for integral
transforms. Consider for instance the joint distribution of the numbers $\chi_1, \chi_2$ of
singletons and doubletons in random permutations. Then, the parameter $\chi = (\chi_1, \chi_2)$
has a trivariate EGF

\[ F(z, u_1, u_2) = \frac{\exp\left((u_1 - 1)z + (u_2 - 1)z^2/2\right)}{1 - z}. \]

Thus, the bivariate PGF satisfies, by meromorphic analysis,

\[ p_n(u_1, u_2) = [z^n] F(z, u_1, u_2) \sim e^{(u_1 - 1)} e^{(u_2 - 1)/2}. \]

The joint distribution of $(\chi_1, \chi_2)$ is then, in the limit, a product of a Poisson(1) and a
Poisson(1/2) distribution; in particular $\chi_1$ and $\chi_2$ are asymptotically independent. Such a fact
results from an extension of the continuity theorem (Theorem IX.1) to multivariate
PGFs that is proved by multiple Cauchy integration.
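The Poisson limit can be observed on exact values. The following is a minimal sketch, not taken from the text: it extracts $[z^n u_1^a u_2^b]$ from the trivariate EGF with rational arithmetic and compares the result with the product of Poisson(1) and Poisson(1/2) masses. The function name `joint_probability` is an illustrative choice.

```python
from fractions import Fraction
from math import exp, factorial

def joint_probability(n, a, b):
    """P(chi_1 = a, chi_2 = b) for a uniform random permutation of size n.

    Uses F = exp(u1 z) exp(u2 z^2/2) * exp(-z - z^2/2) / (1 - z), so that the
    probability is [z^(n-a-2b)] exp(-z - z^2/2)/(1 - z), divided by a! 2^b b!.
    """
    m = n - a - 2 * b
    if m < 0:
        return Fraction(0)
    # e[k] = [z^k] exp(-z - z^2/2); from e'(z) = -(1 + z) e(z): k e[k] = -(e[k-1] + e[k-2])
    e = [Fraction(1)]
    for k in range(1, m + 1):
        prev2 = e[k - 2] if k >= 2 else Fraction(0)
        e.append(-(e[k - 1] + prev2) / k)
    # Division by (1 - z) amounts to taking partial sums of the coefficients.
    return sum(e) / (factorial(a) * 2**b * factorial(b))

# Compare with the Poisson(1) x Poisson(1/2) limit at (a, b) = (1, 1).
exact = float(joint_probability(30, 1, 1))
limit = exp(-1.0) * exp(-0.5) * 0.5
print(exact, limit)
```

Since the generating function involved is meromorphic with a single pole at $z = 1$, the exact values approach the limit very quickly as $n$ grows.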

Consider next the joint distribution of $\chi = (\chi_1, \chi_2)$, where $\chi_j$ is the number of
$j$-summands in a random integer composition. Each parameter individually obeys a
limit Gaussian law, since the sequence construction is supercritical. The trivariate GF
is

\[ F(z, u_1, u_2) = \frac{1}{1 - z(1 - z)^{-1} - (u_1 - 1)z - (u_2 - 1)z^2}. \]

By meromorphic analysis, a higher dimensional quasi-power approximation may be
derived:

\[ [z^n] F(z, u_1, u_2) \sim c(u_1, u_2)\, \rho(u_1, u_2)^{-n}, \]

for some third-degree algebraic function $\rho(u_1, u_2)$. In such cases, multivariate versions
of the continuity theorem for integral transforms can be applied. See the book by Gnedenko
and Kolmogorov [237], and especially the treatment of Bender and Richmond
in [36]. As a result, the joint distribution is, in the asymptotic limit, a bivariate Gaussian
distribution. Such generalizations are typical and involve essentially no radically
new concept, just natural technical adaptations.

A highly interesting approach to multivariate problems is that of functional limit
theorems. There the goal is to characterize the joint distribution of a potentially infinite
collection of parameters. The limit process is then a stochastic process. For
instance, the joint distribution of all altitudes in random walks gives rise to Brownian
motion. The joint distribution of all cycle lengths in random permutations is described
explicitly by Cauchy’s formula (Chapter III), and DeLaurentis and Pittel [115] have
also shown convergence to the standard Brownian motion process. A rather spectacular
application of this circle of ideas was provided in 1977 by Logan, Shepp,
Vershik and Kerov [336, 485]. These authors show that the shape of the pair of Young
tableaux [302] associated to a random permutation conforms, in the asymptotic limit
and with high probability, to a deterministic trajectory defined as the solution to a
variational problem. In particular, the width of a Young tableau associated to a permutation
gives the length of the longest increasing subsequence of the permutation. By
specializing their results, the authors were able to show that the expected length in a
random permutation of size $n$ is asymptotic to $2\sqrt{n}$, a long-standing conjecture at the
time.


IX. 13. Notes 


This chapter is primarily inspired by the works of Bender and Richmond [28, 36, 
37], Canfield [76], Flajolet, Soria, and Drmota [134, 135, 137, 138, 210, 212, 443] as 
well as Hwang [272]. 

Bender’s seminal paper [28] initiated the study of bivariate analytic schemes that
lead to Gaussian laws, and it may rightly be considered to be at the origin
of the field. Canfield [76], building upon earlier works, showed the approach to extend
to saddle point schemas.

Tangible progress was next made possible by the development of the singular- 
ity analysis method [199]. Earlier works were mostly restricted to methods based on 
subtraction of singularities, as in [28], which is in particular effective for meromor- 
phic cases. The extension to algebraic—logarithmic singularities was however difficult 
given that the classical method of Darboux does not provide for uniform error terms. 
In contrast, singularity analysis does apply to classes of analytic functions, since it al- 
lows for uniformity of estimates. The papers by Flajolet and Soria [210, 212] were the 
first to make clear the impact of singularity analysis on bivariate asymptotics. Gao and 
Richmond [226] were then able to extend the theory to cases where both a singularity 
and its singular exponent are allowed to vary. 

From there, Soria developed considerably the framework of schemas in her doc- 
torate [443]. Hwang extracted the very important concept of “quasi-powers” in his 
thesis [272] together with a wealth of properties like full asymptotic expansions, 
speed of convergence, and large deviations. Drmota established general existence 
conditions leading to Gaussian laws in the case of implicit, especially algebraic, func- 
tions [134, 135]. The “singularity perturbation” framework for solutions of linear 
differential equations first appears under that name in [195]. The presentation in 
this chapter is very liberally based on the survey paper [191]. Finally, the books 
by Sachkov, see [420] and especially [422], offer a modern perspective on bivariate 
asymptotics applied to classical combinatorial structures. 

As pointed out in the introduction, the way combinatorial constructions induce 
limit laws via schemas based on a purely local perturbation of a singular structure 
is quite striking. Take for instance the principle that any fixed pattern occurs almost 
surely in a large random object and its number of occurrences is governed by Gaussian 
fluctuations. We have shown this property to hold true for strings, uniform tree models, 
and search trees. In a context that involves either a rational function, an algebraic 
function, or a solution to a nonlinear differential equation, it eventually reduces to a 
very simple property, a singularity that smoothly moves...


I can see looming ahead one of those terrible exercises in probability where
six men have white hats and six men have black hats and you have to
work it out by mathematics how likely it is that the hats will get
mixed up and in what proportion. If you start thinking about
things like that, you would go round the bend. Let me assure you of that!


—AGATHA CHRISTIE 
(The Mirror Crack’d. Toronto, Bantam Books, 1962.) 


Part D 


APPENDICES 


APPENDIX A 


Auxiliary Elementary Notions 


This appendix contains entries arranged in alphabetical order regarding the following topics:
Arithmetical functions; Asymptotic Notations; Combinatorial probability; Cycle
construction; Formal power series; Lagrange Inversion; Regular languages; Stirling
numbers; Tree concepts.


The corresponding notions and results are used throughout the book, and in particular in Part A 
relative to Symbolic Methods. 


1. Arithmetical functions. A general reference for this section is Apostol’s book [12].
First, the Euler totient function $\varphi(k)$ intervenes in the unlabelled cycle construction.
It is defined as the number of integers in $[1, k]$ that are relatively prime to $k$. Thus,
one has $\varphi(p) = p - 1$ if $p \in \{2, 3, 5, \ldots\}$ is a prime. More generally, when the prime
number decomposition of $k$ is $k = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, then

\[ \varphi(k) = p_1^{\alpha_1 - 1}(p_1 - 1) \cdots p_r^{\alpha_r - 1}(p_r - 1). \]

A number is squarefree if it is not divisible by the square of a prime. The Möbius
function $\mu(n)$ is defined to be 0 if $n$ is not squarefree and otherwise is $(-1)^r$ if
$n = p_1 \cdots p_r$ is a product of $r$ distinct primes.

Many elementary properties of arithmetical functions are easily established by
means of Dirichlet generating functions (DGF). Let $(a_n)_{n \ge 1}$ be a sequence; its
DGF is formally defined by

\[ \alpha(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}. \]

In particular, the DGF of the sequence $a_n = 1$ is the Riemann zeta function,
$\zeta(s) = \sum_{n \ge 1} n^{-s}$. The fact that every number uniquely decomposes into primes is reflected
by Euler’s formula,

\[ (1) \qquad \zeta(s) = \prod_{p \in \mathcal{P}} \left(1 - \frac{1}{p^s}\right)^{-1}, \]

where $p$ ranges over the set $\mathcal{P}$ of all primes. (As observed by Euler, the fact that
$\zeta(1) = \infty$ in conjunction with (1) provides a simple analytic proof that there are
infinitely many primes! See Note IV.1, p. 215.)

Equation (1) implies elementarily that

\[ (2) \qquad M(s) := \sum_{n \ge 1} \frac{\mu(n)}{n^s} = \prod_{p \in \mathcal{P}} \left(1 - \frac{1}{p^s}\right) = \frac{1}{\zeta(s)}, \]

where $\mu(n)$ is the Möbius coefficient defined above.
Finally, if $(a_n), (b_n), (c_n)$ have DGF $\alpha(s), \beta(s), \gamma(s)$, then one has the equivalence

\[ \alpha(s) = \beta(s)\gamma(s) \iff a_n = \sum_{d \mid n} b_d\, c_{n/d}. \]

In particular, taking $c_n = 1$ ($\gamma(s) = \zeta(s)$) and solving for $\beta(s)$ shows (using (2)) the
implication

\[ a_n = \sum_{d \mid n} b_d \implies b_n = \sum_{d \mid n} \mu(d)\, a_{n/d}, \]

which is known as Möbius inversion. This relation is used in the enumeration of
irreducible polynomials (Section I. 6.3).
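As an illustration, here is a small sketch of the totient, the Möbius function, and Möbius inversion by direct computation over divisors. All function names are illustrative choices, and the trial-division factorization is only meant for small arguments.

```python
def prime_factors(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def totient(k):
    """Euler totient, from the product formula over the prime decomposition."""
    result = 1
    for p, a in prime_factors(k).items():
        result *= p ** (a - 1) * (p - 1)
    return result

def mobius(n):
    """Moebius function: 0 unless n is squarefree, else (-1)^(number of primes)."""
    factors = prime_factors(n)
    if any(a > 1 for a in factors.values()):
        return 0
    return (-1) ** len(factors)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_invert(a, n):
    """Given a(n) = sum_{d|n} b(d), recover b(n) = sum_{d|n} mu(d) a(n/d)."""
    return sum(mobius(d) * a(n // d) for d in divisors(n))

# Since sum_{d|n} totient(d) = n, inverting a(n) = n must give back the totient.
print([mobius_invert(lambda m: m, n) for n in range(1, 13)])
```

The check uses the classical identity $\sum_{d \mid n} \varphi(d) = n$, which is not stated in the text above but follows from grouping the integers of $[1, n]$ by their gcd with $n$.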


2. Asymptotic Notations. Let $S$ be a set and $s_0 \in S$ a particular element of $S$. We
assume a notion of neighbourhood to exist on $S$. Examples are $S = \mathbb{Z}_{>0} \cup \{+\infty\}$ with
$s_0 = +\infty$, $S = \mathbb{R}$ with $s_0$ any point in $\mathbb{R}$, and $S = \mathbb{C}$ or a subset of $\mathbb{C}$ with $s_0 = 0$,
and so on. Two functions $\phi$ and $g$ from $S \setminus \{s_0\}$ to $\mathbb{C}$ are given.

– O-notation: write

\[ \phi(s) \underset{s \to s_0}{=} O(g(s)) \]

if the ratio $\phi(s)/g(s)$ stays bounded as $s \to s_0$ in $S$. In other words, there
exists a neighbourhood $V$ of $s_0$ and a constant $C > 0$ such that

\[ |\phi(s)| \le C\,|g(s)|, \qquad s \in V, \quad s \ne s_0. \]

One also says that “$\phi$ is of order at most $g$, or $\phi$ is big-Oh of $g$ (as $s$ tends
to $s_0$)”.
– ∼-notation: write

\[ \phi(s) \underset{s \to s_0}{\sim} g(s) \]

if the ratio $\phi(s)/g(s)$ tends to 1 as $s \to s_0$ in $S$. One also says that “$\phi$ and $g$
are asymptotically equivalent (as $s$ tends to $s_0$)”.
– o-notation: write

\[ \phi(s) \underset{s \to s_0}{=} o(g(s)) \]

if the ratio $\phi(s)/g(s)$ tends to 0 as $s \to s_0$ in $S$. In other words, for any
(arbitrarily small) $\epsilon > 0$, there exists a neighbourhood $V_\epsilon$ of $s_0$ (depending
on $\epsilon$), such that

\[ |\phi(s)| \le \epsilon\,|g(s)|, \qquad s \in V_\epsilon, \quad s \ne s_0. \]

One also says that “$\phi$ is of order smaller than $g$, or $\phi$ is little-oh of $g$ (as $s$
tends to $s_0$)”.


These notations are due to Bachmann and Landau towards the end of the nineteenth 
century. See Knuth’s note for a historical discussion [309, Ch. 4]. 
Related notations, of which however we only make scanty use, are 


– Ω-notation: write

\[ \phi(s) \underset{s \to s_0}{=} \Omega(g(s)) \]

if the ratio $\phi(s)/g(s)$ stays bounded from below in modulus by a nonzero
quantity, as $s \to s_0$ in $S$. One then says that $\phi$ is of order at least $g$.
– Θ-notation: write

\[ \phi(s) \underset{s \to s_0}{=} \Theta(g(s)) \]

if $\phi(s) = O(g(s))$ and $\phi(s) = \Omega(g(s))$. One then says that $\phi$ is of order exactly $g$.
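For instance, one has as $n \to +\infty$ in $\mathbb{Z}_{>0}$:

\[ \sin n = o(\log n); \quad \log n = O(\sqrt{n}); \quad \log n = o(\sqrt{n}); \quad \binom{n}{2} = \Omega(n\sqrt{n}); \quad \pi n + \sqrt{n} = \Theta(n). \]

As $x \to 1$ in $\mathbb{R}_{<1}$, one has

\[ \sqrt{1 - x} = o(1); \quad e^x = O(\sin x); \quad \log x = \Theta(x - 1). \]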


We take for granted in this book the elementary asymptotic calculus with such notations
(see, e.g., [434, Ch. 4] for a smooth introduction close to the needs of analytic
combinatorics and de Bruijn’s classic [111] for a beautiful presentation). We shall
retain here in particular the fact that Taylor expansions imply asymptotic expansions;
for instance, the convergent expansions valid for $|u| < 1$,

\[ \log(1+u) = \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} u^k, \quad \exp(u) = \sum_{k \ge 0} \frac{1}{k!} u^k, \quad (1-u)^{-\alpha} = \sum_{k \ge 0} \binom{k+\alpha-1}{k} u^k, \]

imply (as $u \to 0$)

\[ \log(1+u) = u + O(u^2), \quad \exp(u) = 1 + u + \frac{u^2}{2} + O(u^3), \quad (1-u)^{1/2} = 1 - \frac{u}{2} + O(u^2), \]

and, in turn, (as $n \to +\infty$)

\[ \log\left(1 + \frac{1}{n}\right) = \frac{1}{n} + O\left(\frac{1}{n^2}\right), \quad \left(1 - \frac{1}{\log n}\right)^{1/2} = 1 - \frac{1}{2\log n} + O\left(\frac{1}{\log^2 n}\right). \]


Two important asymptotic expansions are Stirling’s formula for factorials and the
harmonic number approximation, valid for $n \ge 1$,

\[ (3) \qquad \begin{array}{ll} n! = n^n e^{-n} \sqrt{2\pi n}\,(1 + \epsilon_n), & 0 < \epsilon_n < \dfrac{1}{11n}, \\[2mm] H_n = \log n + \gamma + \dfrac{1}{2n} + \eta_n, & -\dfrac{1}{12n^2} < \eta_n < 0, \quad \gamma \doteq 0.57721, \end{array} \]

that are best established as consequences of the Euler–Maclaurin summation formula
(see [111, 434] as well as APPENDIX B: Mellin transform, p. 707).
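A quick numerical experiment makes the quality of these two approximations tangible. The sketch below evaluates the correction terms $\epsilon_n$ and $\eta_n$ defined by $n! = n^n e^{-n}\sqrt{2\pi n}\,(1+\epsilon_n)$ and $H_n = \log n + \gamma + \frac{1}{2n} + \eta_n$. The function names and the hard-coded double-precision value of $\gamma$ are assumptions of this illustration, and the bounds printed alongside ($1/(11n)$ and $-1/(12n^2)$) are the ones the sketch chooses to test.

```python
from math import exp, fsum, lgamma, log, pi

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision

def stirling_correction(n):
    """epsilon_n in n! = n^n e^{-n} sqrt(2 pi n) (1 + epsilon_n), via logarithms."""
    log_stirling = n * log(n) - n + 0.5 * log(2 * pi * n)
    return exp(lgamma(n + 1) - log_stirling) - 1

def harmonic_correction(n):
    """eta_n in H_n = log n + gamma + 1/(2n) + eta_n."""
    h_n = fsum(1 / k for k in range(1, n + 1))
    return h_n - (log(n) + EULER_GAMMA + 1 / (2 * n))

for n in (1, 10, 100):
    print(n, stirling_correction(n), 1 / (11 * n),
          harmonic_correction(n), -1 / (12 * n * n))
```

Working with `lgamma` rather than with `n**n` directly avoids floating-point overflow for moderately large $n$.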

Asymptotic scales. An important notion due to Henri Poincaré is that of an asymptotic
scale. A sequence of functions $\omega_0, \omega_1, \ldots$ is said to constitute an asymptotic
scale if all functions $\omega_j$ exist in a common neighbourhood of $s_0 \in S$ and if they satisfy
there, for all $j \ge 0$:

\[ \omega_{j+1}(s) = o(\omega_j(s)), \qquad \text{i.e.,} \qquad \lim_{s \to s_0} \frac{\omega_{j+1}(s)}{\omega_j(s)} = 0. \]

Examples at 0 are the scales: $u_j(x) = x^j$; $v_{2j}(x) = x^j \log x$ and $v_{2j+1}(x) = x^j$;
$w_j(x) = x^{j/2}$. Examples at infinity are $t_j(n) = n^{-j}$, and so on. Given a scale
$\Phi = (\omega_j(s))_{j \ge 0}$, a function $f$ is said to admit an asymptotic expansion in the scale $\Phi$
if there exists a family of complex coefficients $(\lambda_j)$ (the family is then necessarily
unique) such that, for each integer $m$:

\[ (4) \qquad f(s) = \sum_{j=0}^{m} \lambda_j \omega_j(s) + O(\omega_{m+1}(s)) \qquad (s \to s_0). \]

In this case, one writes

\[ (5) \qquad f(s) \sim \sum_{j=0}^{\infty} \lambda_j \omega_j(s), \qquad (s \to s_0) \]


with an extension of the symbol ‘~’. (Some authors prefer the notation ‘~”.) The 
scale may be finite and in most cases, we do not need to specify it as it is clear from
context. For instance, one can write

\[ H_n \sim \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \cdots, \qquad \tan x \sim x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \cdots. \]

In the first case, it is understood that $n \to \infty$ and the scale is $\log n, 1, n^{-1}, n^{-2}, \ldots$.
In the second case, $x \to 0$ and the scale is $x, x^3, x^5, \ldots$. Note that in the case of an
infinite expansion, convergence of the infinite sum is not implied in (5): the relation is
to be interpreted literally in the sense of (4) as a collection of more and more precise
descriptions of $f$ as $s$ becomes closer and closer to $s_0$.


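▷ 1.1. Simplification rules for the asymptotic calculus. Some of them are

\[ \begin{array}{rcll}
O(\lambda f) & \longrightarrow & O(f) & (\lambda \ne 0) \\
O(f) \pm O(g) & \longrightarrow & O(|f| + |g|) & \\
 & \longrightarrow & O(f) & \text{if } g = O(f) \\
O(f \cdot g) & \longrightarrow & O(f)\,O(g). &
\end{array} \]

Similar rules apply for $o(\cdot)$. ◁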


▷ 1.2. Harmonics of harmonics. The harmonic numbers are readily extended to non-integral
index by (cf. also the $\psi$ function, p. 692)

\[ H_x := \sum_{k=1}^{\infty} \left[\frac{1}{k} - \frac{1}{k+x}\right]. \]

For instance, $H_{1/2} = 2 - 2\log 2$. This extension is related to the Gamma function [492], and it
can be proved that the asymptotic estimate (3), with $x$ replacing $n$, remains valid as $x \to +\infty$.
A typical asymptotic calculation shows that

\[ H_{H_n} = \log\log n + \gamma + \frac{\gamma + \frac{1}{2}}{\log n} + O\left(\frac{1}{\log^2 n}\right). \]

What is the shape of an asymptotic expansion of $H_{H_{H_n}}$? ◁

▷ 1.3. Stackings of dominos. A stock of dominos of length 2cm is given. It is well known that
one can stack up dominos in a harmonic mode:


Estimate within 1% the minimal number of dominos needed to achieve a horizontal span of
1m (=100cm). [Hint: about $1.50926 \cdot 10^{43}$ dominos!] Set up a scheme to evaluate this integer
exactly, and do it! ◁
▷ 1.4. High precision fraud. Why is it that, to forty decimal places, one finds

\[ \begin{array}{rcl}
4 \displaystyle\sum_{k=1}^{500{,}000} \frac{(-1)^{k-1}}{2k-1} & = & 3.141590653589793240462643383269502884197 \\
\pi & = & 3.141592653589793238462643383279502884197,
\end{array} \]

with only four “wrong” digits in the first sum? (Hint: consider the simpler problem

\[ \frac{1}{9801} = 0.00\,01\,02\,03\,04\,05\,06\,07\,08\,09\,10\,11\,12\,13\,14\,15\,16\,17\,18\,19\,20\,21\,22\,23\,24\,25\ldots .) \]

Many fascinating facts of this kind are found in works by Jon and Peter Borwein [63, 64]. ◁


Uniform asymptotic expansions. The notions previously introduced admit of
uniform versions in the case of families dependent on a secondary parameter [111,
pp. 7-9]. Let $\{f_u(s)\}_{u \in U}$ be a family of functions indexed by $U$. An asymptotic
estimate like

\[ f_u(s) = O(g(s)) \qquad (s \to s_0), \]

is said to be uniform with respect to $u$ if there exists an absolute constant $K$ (independent
of $u \in U$) and a fixed neighbourhood $V$ of $s_0$ such that

\[ \forall u \in U, \ \forall s \in V: \qquad |f_u(s)| \le K\,|g(s)|. \]

This definition in turn gives rise to the notion of a uniform asymptotic expansion: it
suffices that, for each $m$, the $O$ error term in (4) be uniform in the sense above. Such
notions are central for the determination of limit laws in Chapter IX, where a uniform
expansion of a class of generating functions near a singularity is usually required.

▷ 1.5. Examples of uniform asymptotics. One has uniformly, for $u \in \mathbb{R}$ and $u \in [0, 1]$
respectively:
BCE) <= OC: (: | *) = 1444 o(3). 


However, the second expansion no longer holds uniformly with respect to $u$ when $u \in \mathbb{R}$ (take
$u = \pm n$), though it holds pointwise (non-uniformly) for any fixed $u \in \mathbb{R}$. What about the
u 2 

assertion (1 *) ae eee) (5) for u € R? <J 
n nm— Co n nM 


3. Combinatorial probability. This entry gathers elementary concepts from probability
theory specialized to the discrete case and used in Chapter III. A more elaborate
discussion of probability theory forms the subject of Appendix C.
Given a finite set $S$, the uniform probability measure assigns to any $\sigma \in S$ the
probability mass

\[ P(\sigma) = \frac{1}{\mathrm{card}(S)}. \]

The probability of any set, also known as event, $E \subseteq S$, is then measured by

\[ P(E) := \frac{\mathrm{card}(E)}{\mathrm{card}(S)} = \sum_{\sigma \in E} P(\sigma) \]

(“the number of favorable cases over the total number of cases”).

Given a combinatorial class $\mathcal{A}$, we make extensive use of this notion with the
choice of $S = \mathcal{A}_n$. This defines a probability model (indexed by $n$), in which all
elements of size $n$ in $\mathcal{A}$ are taken with equal likelihood. For this uniform probabilistic
model, we write

\[ P_n \quad \text{and} \quad P_{\mathcal{A}_n}, \]

whenever the size and the type of combinatorial structure considered need to be emphasized.

Next consider a parameter $\chi$, which is a function from $S$ to $\mathbb{Z}_{\ge 0}$. We regard such
a parameter as a random variable, determined by its probability distribution,

\[ P(\chi = k) = \frac{\mathrm{card}(\{\sigma \mid \chi(\sigma) = k\})}{\mathrm{card}(S)}. \]

The notions above extend gracefully to nonuniform probability models that are determined
by a family of nonnegative numbers $(p_\sigma)_{\sigma \in S}$ which add up to 1:

\[ P(\sigma) = p_\sigma, \qquad P(E) := \sum_{\sigma \in E} p_\sigma, \qquad P(\chi = k) = \sum_{\chi(\sigma) = k} p_\sigma. \]


Moments. Important information on a distribution is provided by its moments.
We state here the definitions for an arbitrary discrete random variable supported by $\mathbb{Z}$
and determined by its probability distribution, $P(X = k) = p_k$ where the $(p_k)_{k \in \mathbb{Z}}$
are nonnegative numbers that add up to 1. The expectation of $f(X)$ is defined as the
linear functional

\[ E(f(X)) := \sum_k P\{X = k\} \cdot f(k). \]

In particular, the (power) moment of order $r$ is defined as the expectation:

\[ E(X^r) = \sum_k P\{X = k\} \cdot k^r. \]

Of special importance are the first two moments of the random variable $X$. The
expectation (also mean or average) $E(X)$ is

\[ E(X) = \sum_k P\{X = k\} \cdot k. \]

The second moment $E(X^2)$ gives rise to the variance,

\[ V(X) = E\left((X - E(X))^2\right) = E(X^2) - E(X)^2, \]

and, in turn, to the standard deviation

\[ \sigma(X) = \sqrt{V(X)}. \]

The mean deserves its name as first observed by Galileo Galilei (1564-1642): if a large
number of draws are effected and values of $X$ are observed, then the arithmetical mean
of the observed values will normally be close to the expectation $E(X)$. The standard
deviation measures in a mean quadratic sense the dispersion of values around the
expectation $E(X)$.


▷ 1.6. The weak law of large numbers. Let $(X_k)$ be a sequence of mutually independent
random variables with a common distribution. If the expectation $\mu = E(X_k)$ exists, then for
every $\epsilon > 0$:

\[ \lim_{n \to \infty} P\left(\left|\frac{1}{n}(X_1 + \cdots + X_n) - \mu\right| > \epsilon\right) = 0. \]

(See [161, Ch X] for a proof.) Note that the property does not require finite variance. ◁


Probability generating function. The probability generating function (PGF) of
$X$ is by definition:

\[ p(u) := \sum_k P(X = k)\, u^k, \]

and an alternative expression is $p(u) = E(u^X)$. Moments can be recovered from the
PGF by differentiation at 1, for instance:

\[ E(X) = \frac{d}{du} p(u)\Big|_{u=1}, \qquad E(X(X-1)) = \frac{d^2}{du^2} p(u)\Big|_{u=1}. \]

More generally, the quantity

\[ E(X(X-1)\cdots(X-k+1)) = \frac{d^k}{du^k} p(u)\Big|_{u=1} \]

is known as the kth factorial moment.
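These definitions translate directly into a few lines of code. The following sketch represents a distribution on the nonnegative integers as a dictionary mapping $k$ to $P(X = k)$ and computes the mean, the variance, and factorial moments by repeated differentiation of the PGF at 1. The function names and the fair-die example are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def expectation(dist, f=lambda k: k):
    """E(f(X)) = sum_k P(X = k) f(k) for a distribution given as {k: probability}."""
    return sum(p * f(k) for k, p in dist.items())

def variance(dist):
    mean = expectation(dist)
    return expectation(dist, lambda k: (k - mean) ** 2)

def factorial_moment(dist, r):
    """r-th derivative at u = 1 of the PGF p(u) = sum_k P(X = k) u^k."""
    coeffs = dict(dist)
    for _ in range(r):
        # d/du of p_k u^k is k p_k u^(k-1); constant terms disappear.
        coeffs = {k - 1: k * p for k, p in coeffs.items() if k > 0}
    return sum(coeffs.values())

die = {k: Fraction(1, 6) for k in range(1, 7)}  # a fair six-sided die
mean = expectation(die)
print(mean, variance(die), factorial_moment(die, 2))
# The variance is recovered from the first two factorial moments:
print(factorial_moment(die, 2) + mean - mean**2 == variance(die))
```

The last line reflects the identity $V(X) = E(X(X-1)) + E(X) - E(X)^2$, which is how variances are usually extracted from a PGF.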


▷ 1.7. Relations between factorial and power moments. Let $X$ be a discrete random variable
with PGF $p(u)$; denote by $\mu_r = E(X^r)$ its $r$th moment and by $\phi_r$ its factorial moment. One
has

\[ \mu_r = \partial_t^r\, p(e^t)\big|_{t=0}, \qquad \phi_r = \partial_u^r\, p(u)\big|_{u=1}. \]

Consequently, with ${n \brack k}$ and ${n \brace k}$ the Stirling numbers of both kinds (APPENDIX A: Stirling
numbers, p. 680),

\[ \phi_r = \sum_j (-1)^{r-j} {r \brack j} \mu_j, \qquad \mu_r = \sum_j {r \brace j} \phi_j. \]

(Hint: for $\phi_r \to \mu_r$, expand the Stirling polynomial defined in (12) below; in the converse
direction, write $p(e^t) = p(1 + (e^t - 1))$.) ◁

Markov-Chebyshev inequalities. These are fundamental inequalities that apply
equally well to discrete and to continuous random variables (see Appendix C for the
latter).


Theorem A.1 (Markov-Chebyshev inequalities). Let $X$ be a nonnegative random
variable and $Y$ an arbitrary real random variable. One has for an arbitrary $t > 0$:

\[ \begin{array}{ll}
P\{X \ge t\,E(X)\} \le \dfrac{1}{t} & \text{(Markov inequality)} \\[2mm]
P\{|Y - E(Y)| \ge t\,\sigma(Y)\} \le \dfrac{1}{t^2} & \text{(Chebyshev inequality).}
\end{array} \]

PROOF. Without loss of generality, one may assume that $X$ has been scaled in such
a way that $E(X) = 1$. Define the function $f(x)$ whose value is 1 if $x \ge t$, and 0
otherwise. Then

\[ P\{X \ge t\} = E(f(X)). \]

Since $f(x) \le x/t$, the expectation on the right is at most $1/t$. Markov’s inequality
follows. Chebyshev’s inequality then results from Markov’s inequality applied to
$X = |Y - E(Y)|^2$, with $t^2$ in place of $t$.

Theorem A.1 informs us that the probability of being much larger than the mean
must decay (Markov) and that an upper bound on the decay is measured in units given
by the standard deviation (Chebyshev).
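To see how conservative these bounds are in practice, one can compare them with exact tail probabilities. The sketch below does so for a binomial distribution with parameters $(20, 0.3)$; the choice of distribution, the helper names, and the sample values of $t$ are all assumptions made for illustration.

```python
from math import comb, sqrt

def binomial_distribution(n, p):
    """Probability mass function of Binomial(n, p) as a dict {k: probability}."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

def tail(dist, predicate):
    """Total probability of the values k satisfying the predicate."""
    return sum(p for k, p in dist.items() if predicate(k))

dist = binomial_distribution(20, 0.3)
mean = sum(k * p for k, p in dist.items())
sigma = sqrt(sum((k - mean) ** 2 * p for k, p in dist.items()))

def markov_holds(t):
    return tail(dist, lambda k: k >= t * mean) <= 1 / t

def chebyshev_holds(t):
    return tail(dist, lambda k: abs(k - mean) >= t * sigma) <= 1 / t**2

for t in (1.5, 2, 3):
    exact = tail(dist, lambda k: abs(k - mean) >= t * sigma)
    print(t, exact, 1 / t**2, markov_holds(t), chebyshev_holds(t))
```

For concentrated distributions such as this one, the exact tails are typically much smaller than the Chebyshev bound, which only uses the first two moments.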

Moment inequalities are discussed for instance in Billingsley’s reference trea- 
tise [55, p. 74]. They are of great importance in discrete mathematics where they 
have been put to use in order to show the existence of surprising configurations. This 
field was pioneered by Erdős and is often known as the “probabilistic method” [in
combinatorics]; see the book by Alon and Spencer [9] for many examples. Moment
inequalities can also be used to estimate the probabilities of complex events by reducing
the problems to moment estimates for occurrences of simpler configurations; this
is one of the bases of the “first and second moment methods”, again pioneered by
Erdős, which are central in the theory of random graphs [60, 283]. Finally, moment
inequalities serve to design, analyse, and optimize randomized algorithms, a theme 
excellently covered in the book by Motwani and Raghavan [370]. 


4. Cycle construction. The unlabelled cycle construction is introduced in Chapter I
and is classically obtained within the framework of Pólya theory [98, 395, 397]. The
derivation given here is based on an elementary use of symbolic methods that follows [211].
It relies on bivariate GFs developed in Chapter III, with $z$ marking size
and $u$ marking the number of components. Consider a class $\mathcal{A}$ and the sequence class
$\mathcal{S} = \mathrm{SEQ}_{\ge 1}(\mathcal{A})$. A sequence $\sigma \in \mathcal{S}$ is primitive (or aperiodic) if it is not the repetition
of another sequence (e.g., $\alpha\beta\beta\alpha\alpha$ is primitive, but $\alpha\beta\alpha\beta = (\alpha\beta)^2$ is not). The class
$\mathcal{PS}$ of primitive sequences is determined implicitly,

\[ S(z, u) \equiv \frac{u A(z)}{1 - u A(z)} = \sum_{k \ge 1} PS(z^k, u^k), \]

which expresses that every sequence possesses a “root” that is primitive. Möbius
inversion then gives

\[ PS(z, u) = \sum_{k \ge 1} \mu(k)\, S(z^k, u^k) = \sum_{k \ge 1} \mu(k) \frac{u^k A(z^k)}{1 - u^k A(z^k)}. \]

A cycle is primitive if all of its linear representations are primitive. There is an
exact one-to-$\ell$ correspondence between primitive $\ell$-cycles and primitive $\ell$-sequences.
Thus, the BGF $PC(z, u)$ of primitive cycles is obtained by effecting the transformation
$u^\ell \mapsto \frac{1}{\ell} u^\ell$ on $PS(z, u)$, which means

\[ PC(z, u) = \int_0^u PS(z, v)\, \frac{dv}{v}, \]

giving after term-wise integration,

\[ PC(z, u) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log \frac{1}{1 - u^k A(z^k)}. \]

Finally, cycles can be composed from arbitrary repetitions of primitive cycles
(each cycle has a primitive “root”), which yields for $\mathcal{C} = \mathrm{CYC}(\mathcal{A})$:

\[ C(z, u) = \sum_{k \ge 1} PC(z^k, u^k). \]

The arithmetical identity $\sum_{d \mid k} \mu(d)/d = \varphi(k)/k$ gives eventually

\[ (6) \qquad C(z, u) = \sum_{k \ge 1} \frac{\varphi(k)}{k} \log \frac{1}{1 - u^k A(z^k)}. \]

Formula (6) specializes to the one that appears in the translation of the cycle
construction in the unlabelled case (Theorem I.1), upon setting $u = 1$; this formula
also coincides with the statement of Proposition III.5 regarding the number of components
in cycles, and it yields the general multivariate version (Theorem III.1) by a simple
adaptation of the argument.
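The cycle formula can be tested on the simplest case, cycles of letters over an $m$-ary alphabet, where $A(z) = mz$ and $u = 1$. Extracting $[z^n]$ from the logarithmic sum gives $\frac{1}{n}\sum_{d \mid n} \varphi(d)\, m^{n/d}$. The sketch below compares this with a brute-force count of words up to rotation; function names are illustrative, and the enumeration is only practical for small $n$ and $m$.

```python
from itertools import product
from math import gcd

def totient(k):
    """Euler totient by direct counting (adequate for small k)."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def cycles_by_formula(n, m):
    """[z^n] sum_k (phi(k)/k) log 1/(1 - m z^k) = (1/n) sum_{d|n} phi(d) m^(n/d)."""
    total = sum(totient(d) * m ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n

def cycles_by_enumeration(n, m):
    """Count words of length n over m letters up to rotation (canonical = least rotation)."""
    seen = set()
    for word in product(range(m), repeat=n):
        seen.add(min(word[i:] + word[:i] for i in range(n)))
    return len(seen)

print([cycles_by_formula(n, 2) for n in range(1, 11)])
print(all(cycles_by_formula(n, 2) == cycles_by_enumeration(n, 2) for n in range(1, 11)))
```

The coefficient extraction relies only on the expansion $\log\frac{1}{1-x} = \sum_{j \ge 1} x^j/j$ applied to each term of the sum.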


▷ 1.8. Around the cycle construction. Similar methods yield the BGFs of multisets of cycles
and multisets of aperiodic cycles as

\[ \prod_{k \ge 1} \frac{1}{1 - u^k A(z^k)} \qquad \text{and} \qquad \frac{1}{1 - u A(z)}, \]

respectively [112]. (The latter fact corresponds to the property that any word can be written
as a decreasing product of Lyndon words. Notably, it serves to construct bases of free Lie
algebras [337, Ch. 5].) ◁


▷ 1.9. Aperiodic words. An aperiodic word is a primitive sequence of letters. The number of
aperiodic words of length $n$ over an $m$-ary alphabet corresponds to primitive sequences with
$A(z) = mz$ and is

\[ PW_n^{(m)} = \sum_{d \mid n} \mu(d)\, m^{n/d}. \]

For $m = 2$, the sequence starts as 2, 2, 6, 12, 30, 54, 126, 240, 504, 990 (EIS A027375). ◁

5. Formal power series. Formal power series extend the usual operations on polynomials
to infinite series of the form

\[ (7) \qquad f = \sum_{n \ge 0} f_n z^n, \]

where $z$ is a formal indeterminate. The notation $f(z)$ is also employed. Let $\mathbb{K}$ be a
ring of coefficients (usually we shall take one of the fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$); the ring of formal
power series is denoted by $\mathbb{K}[[z]]$ and it is the set $\mathbb{K}^{\mathbb{N}}$ (of infinite sequences of elements
of $\mathbb{K}$) written as infinite power series (7) and endowed with the operations of sum and
product,

\[ \begin{array}{rcl}
\left(\displaystyle\sum_n f_n z^n\right) + \left(\displaystyle\sum_n g_n z^n\right) & := & \displaystyle\sum_n (f_n + g_n)\, z^n \\[3mm]
\left(\displaystyle\sum_n f_n z^n\right) \times \left(\displaystyle\sum_n g_n z^n\right) & := & \displaystyle\sum_n \left(\sum_{k=0}^{n} f_k\, g_{n-k}\right) z^n.
\end{array} \]

A topology, known as the formal topology, is put on $\mathbb{K}[[z]]$ by which two series
$f, g$ are “close” if they coincide up to a large number of terms. First, the valuation of
a formal power series $f = \sum_n f_n z^n$ is the smallest $r$ such that $f_r \ne 0$ and is denoted
by $\mathrm{val}(f)$. (One sets $\mathrm{val}(0) = +\infty$.) Given two power series $f$ and $g$, their
distance $d(f, g)$ is then defined as $2^{-\mathrm{val}(f - g)}$. With this distance (in fact an ultrametric
distance), the space of all formal power series becomes a complete metric space.
Roughly, the limit of a sequence of series $\{f^{(j)}\}$ exists if, for each $n$, the coefficient
of order $n$ in $f^{(j)}$ eventually stabilizes to a fixed value as $j \to \infty$. In this way formal
convergence can be defined for infinite sums: it suffices that the general term of the
sum should tend to 0 in the formal topology, i.e., the valuation of the general term
should tend to $\infty$. Similarly for infinite products, where $\prod (1 + u^{(j)})$ converges as
soon as $u^{(j)}$ tends to 0 in the topology of formal power series.

It is then a simple exercise to prove that the sum $Q(f) := \sum_{k \ge 0} f^k$ exists (the
sum converges in the formal topology) whenever $f_0 = 0$; the quantity then defines
the quasi-inverse written $(1 - f)^{-1}$, with the implied properties with respect to
multiplication (namely, $Q(f)(1 - f) = 1$). In the same way one defines formally
logarithms and exponentials, primitives and derivatives, etc. Also, the composition
$f \circ g$ is defined whenever $g_0 = 0$ by substitution of formal power series. More generally,
any (possibly infinitary) process on series that involves at each coefficient only
finitely many operations is well-defined and is accordingly a continuous functional in
the formal topology.


▷ 1.10. The OGF of permutations. The ordinary generating function of permutations,

\[ P(z) := \sum_{n=0}^{\infty} n!\, z^n = 1 + z + 2z^2 + 6z^3 + 24z^4 + 120z^5 + 720z^6 + 5040z^7 + \cdots \]

exists as an element of $\mathbb{C}[[z]]$, although the series has radius of convergence 0. The quantity
$1/P(z)$ is for instance well-defined (via the quasi-inverse) and one can compute legitimately
and effectively $1 - 1/P(z)$ whose coefficients enumerate indecomposable permutations (p. 82).
The formal series $P(z)$ can even be made sense of analytically as an asymptotic series (Euler),
since

\[ \int_0^{\infty} \frac{e^{-t}}{1 + tz}\, dt \sim 1 - z + 2!\,z^2 - 3!\,z^3 + 4!\,z^4 - \cdots \qquad (z \to 0+). \]

Thus, the OGF of permutations is also representable as the (formal, divergent) asymptotic series
associated to an integral. ◁
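The formal operations described above are easy to carry out on truncated coefficient lists. The following sketch implements the Cauchy product and the quasi-inverse $Q(f) = \sum_{k \ge 0} f^k$, then uses them to compute the first coefficients of $1 - 1/P(z)$ for the divergent series $P(z) = \sum n!\, z^n$. The list-based representation and the function names are choices made for this illustration.

```python
from math import factorial

def multiply(f, g, order):
    """Cauchy product of two coefficient lists, truncated after z^order."""
    h = [0] * (order + 1)
    for i, fi in enumerate(f[: order + 1]):
        for j, gj in enumerate(g[: order + 1 - i]):
            h[i + j] += fi * gj
    return h

def quasi_inverse(f, order):
    """Q(f) = sum_{k>=0} f^k = 1/(1 - f), truncated; requires f_0 = 0."""
    if f[0] != 0:
        raise ValueError("quasi-inverse needs a series with zero constant term")
    result = [0] * (order + 1)
    power = [1] + [0] * order  # f^0
    # f^k has valuation at least k, so powers beyond `order` do not contribute.
    for _ in range(order + 1):
        result = [r + p for r, p in zip(result, power)]
        power = multiply(power, f, order)
    return result

ORDER = 7
P = [factorial(n) for n in range(ORDER + 1)]
# 1/P = 1/(1 - f) with f = -(P - 1), which has zero constant term.
inverse_P = quasi_inverse([0] + [-c for c in P[1:]], ORDER)
indecomposable = [1 - inverse_P[0]] + [-c for c in inverse_P[1:]]
print(indecomposable)
```

Only finitely many operations are needed for each coefficient, which is exactly why the computation is legitimate even though $P(z)$ converges nowhere except at 0.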


It can be proved that the usual functional properties of analysis extend to formal 
power series provided they make sense formally. The extension to multivariate formal 
power series follows along entirely similar lines. 


6. Lagrange Inversion. Lagrange inversion (Lagrange, 1770) relates the coefficients 
of the inverse of a function to coefficients of the powers of the function itself. It 
thus establishes a fundamental correspondence between functional composition and 
standard multiplication of series. Although the proof is technically simple, the result 
is altogether non-elementary. 

The inversion problem $z = h(y)$ is solved by the Lagrange series given below. It
is assumed that $[y^0]h(y) = 0$, so that inversion is formally well defined and analytically
local, and $[y^1]h(y) \ne 0$. The problem is then conveniently standardized by
setting $h(y) = y/\phi(y)$.


Theorem A.2. Let $\phi(u) = \sum_{k \ge 0} \phi_k u^k$ be a power series of $\mathbb{C}[[u]]$ with $\phi_0 \ne 0$.
Then, the equation $y = z\phi(y)$ admits a unique solution in $\mathbb{C}[[z]]$ whose coefficients
are given by (Lagrange form)

\[ (8) \qquad y(z) = \sum_{n=1}^{\infty} y_n z^n, \qquad \text{where} \quad y_n = \frac{1}{n} [u^{n-1}]\, \phi(u)^n. \]

Furthermore, one has for $k > 0$ (Bürmann form)

\[ (9) \qquad y(z)^k = \sum_{n=1}^{\infty} y_n^{(k)} z^n, \qquad \text{where} \quad y_n^{(k)} = \frac{k}{n} [u^{n-k}]\, \phi(u)^n. \]

By linearity, a form equivalent to Bürmann’s (9), with $H$ an arbitrary function, is

\[ [z^n] H(y(z)) = \frac{1}{n} [u^{n-1}] \left(H'(u)\, \phi(u)^n\right). \]
PROOF. The method of indeterminate coefficients provides a system of polynomial
equations for $\{y_n\}$ that is seen to admit a unique solution:

\[ y_1 = \phi_0, \qquad y_2 = \phi_0 \phi_1, \qquad y_3 = \phi_0 \phi_1^2 + \phi_0^2 \phi_2, \quad \ldots. \]

Since $y_n$ only depends polynomially on the coefficients of $\phi(u)$ till order $n$, one may
assume without loss of generality, in order to establish (8) and (9), that $\phi$ is a polynomial.
Then, by general properties of analytic functions, $y(z)$ is analytic at 0 (see
Chapter IV and APPENDIX B: Equivalent definitions of analyticity, p. 687 for definitions)
and it maps conformally a neighbourhood of 0 into another neighbourhood
of 0. Accordingly, the quantity $n y_n = [z^{n-1}] y'(z)$ can be estimated by Cauchy’s
coefficient formula:

\[ (10) \qquad \begin{array}{rcll}
n y_n & = & \dfrac{1}{2i\pi} \displaystyle\oint_{0+} y'(z)\, \frac{dz}{z^n} & \text{(Direct coefficient formula for } y'(z)\text{)} \\[3mm]
 & = & \dfrac{1}{2i\pi} \displaystyle\oint_{0+} \frac{dy}{(y/\phi(y))^n} & \text{(Change of variable } z \mapsto y\text{)} \\[3mm]
 & = & [y^{n-1}]\, \phi(y)^n & \text{(Reverse coefficient formula for } \phi(y)^n\text{)}.
\end{array} \]

In the context of complex analysis, this useful result appears as nothing but an avatar
of the change-of-variable formula. The proof of Bürmann’s form is similar.
There exist instructive (but longer) combinatorial proofs based on what is known
as the “cyclic lemma” or “conjugacy principle” [407] for Łukasiewicz words. (See
also Note 44 in Chapter I.) Another classical proof due to Henrici relies on properties
of iteration matrices [266, §1.9]; see also Comtet’s book for related formulations [98].
Lagrange inversion serves most notably to develop explicit formulae for simple
families of trees (Chapters I and II), random mappings (Chapter III), and more generally
for problems involving coefficients of powers of functions.
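The Lagrange form lends itself to a direct check on truncated series. The sketch below takes a polynomial $\phi$, computes $y_n = \frac{1}{n}[u^{n-1}]\phi(u)^n$, and compares the result with the solution of $y = z\phi(y)$ obtained by fixed-point iteration. The choice $\phi(u) = (1+u)^2$, for which the coefficients are the Catalan numbers, and all function names are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb

def multiply(f, g, order):
    """Cauchy product of two coefficient lists, truncated after z^order."""
    h = [0] * (order + 1)
    for i, fi in enumerate(f[: order + 1]):
        for j, gj in enumerate(g[: order + 1 - i]):
            h[i + j] += fi * gj
    return h

def power(f, n, order):
    """n-th power of a series, truncated after degree `order`."""
    result = [1] + [0] * order
    for _ in range(n):
        result = multiply(result, f, order)
    return result

def lagrange_coefficients(phi, order):
    """y_n = (1/n) [u^(n-1)] phi(u)^n for n = 1..order."""
    return [Fraction(power(phi, n, n - 1)[n - 1], n) for n in range(1, order + 1)]

def solve_by_iteration(phi, order):
    """Solve y = z * phi(y) by iterating on series truncated after z^order."""
    y = [0] * (order + 1)
    for _ in range(order):  # each pass fixes at least one more coefficient
        composed = [0] * (order + 1)
        y_power = [1] + [0] * order
        for c in phi:  # phi(y) = sum_k phi_k y^k
            composed = [a + c * b for a, b in zip(composed, y_power)]
            y_power = multiply(y_power, y, order)
        y = [0] + composed[:order]  # multiplication by z shifts coefficients
    return y[1:]

phi = [1, 2, 1]  # phi(u) = (1 + u)^2
print(lagrange_coefficients(phi, 8))
print(solve_by_iteration(phi, 8))
```

For this $\phi$, the closed form $y_n = \frac{1}{n}\binom{2n}{n-1}$ follows at once from the binomial theorem applied to $(1+u)^{2n}$.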
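▷ 1.11. Lagrange–Bürmann inversion for fractional powers. The formula

\[ [z^n] \left(\frac{y(z)}{z}\right)^{\alpha} = \frac{\alpha}{n + \alpha}\, [u^n]\, \phi(u)^{n+\alpha} \]

holds for any real or complex exponent $\alpha$, and hence generalizes Bürmann’s form. One can
similarly expand $\log(y(z)/z)$. ◁
▷ 1.12. Abel’s identity. By computing in two different ways the coefficient

\[ [z^n]\, e^{(\alpha+\beta)y} = [z^n]\, e^{\alpha y} \cdot e^{\beta y}, \]

where $y = z e^y$ is the Cayley tree function, one derives Abel’s identity

\[ (\alpha + \beta)(n + \alpha + \beta)^{n-1} = \alpha\beta \sum_{k=0}^{n} \binom{n}{k} (k + \alpha)^{k-1} (n - k + \beta)^{n-k-1}. \]

◁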


7. Regular languages. A language is a set of words over some fixed alphabet A. The 
structurally simplest (yet nontrivial) languages are the regular languages that, as as- 
serted on p. 54, can be defined in a variety of equivalent ways (see [3, Ch. 3] or [149]): 
by regular expressions, either ambiguous or not, and by finite automata, either deter- 
ministic or nondeterministic. Our definitions of S-regularity (S as in specification) 
and A-regularity (A as in automaton) from Chapter I correspond to definability by 
unambiguous regular expression and deterministic automaton, respectively. 


Regular expressions and ambiguity. Here is first the classical definition of a 
regular language in formal language theory. 


Definition A.1. The category RegExp of regular expressions is defined inductively
by the property that it contains all the letters of the alphabet ($a \in A$) as well as the
empty symbol $\epsilon$, and is such that, if $R_1, R_2 \in$ RegExp, then the formal expressions
$R_1 \cup R_2$, $R_1 \cdot R_2$ and $R_1^{*}$ are regular expressions.

Regular expressions are meant to denote languages. The language L(R) associated
to R is obtained by interpreting ‘∪’ as set-theoretic union, ‘·’ as catenation
product extended to sets and ‘*’ as the star operation: L(R*) := {ε} ∪ L(R) ∪
(L(R) · L(R)) ∪ ···. These operations rely on set-theoretic operations and place no
condition on multiplicities (a word may be obtained in several different ways).
Accordingly, the notions of a regular expression and a regular language are useful when
studying structural properties of languages, but they must be adapted for enumeration
purposes, where unambiguous specifications are needed.

A word w ∈ L(R) may be parsable in several ways according to R: the ambiguity
coefficient (or multiplicity) of w with respect to the regular expression R is defined¹
as the number of parsings and written $\kappa(w) = \kappa_R(w)$.

A regular expression R is said to be unambiguous if for all w, we have $\kappa_R(w) \in$
{0, 1}, ambiguous otherwise. In the unambiguous case, if $\mathcal{L}$ = L(R), then $\mathcal{L}$ is
S-regular in the sense of Chapter I, a specification being obtained by the translation
rules:


$$\cup \;\mapsto\; +, \qquad \cdot \;\mapsto\; \times, \qquad (\ )^{*} \;\mapsto\; \operatorname{SEQ}, \tag{11}$$


and the translation mechanism afforded by Proposition I.2 p. 48 applies. (Use of the 
general mechanism (11) in the ambiguous case would imply that we enumerate words 
with multiplicity (ambiguity) coefficients taken into account.) 
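The ambiguity coefficient can be computed mechanically in simple cases. The sketch below is an illustration only, restricted to expressions of the form $(w_1 \cup \cdots \cup w_r)^{*}$ with fixed words $w_i$; the helper name `count_parsings` is hypothetical. It counts the parsings of a word by dynamic programming over prefixes.

```python
def count_parsings(word, blocks):
    """Number of ways to write `word` as a concatenation of strings from `blocks`.

    This is the ambiguity coefficient of `word` with respect to the
    regular expression (blocks[0] U blocks[1] U ...)*.
    """
    ways = [0] * (len(word) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) parsing
    for end in range(1, len(word) + 1):
        for block in blocks:
            start = end - len(block)
            if start >= 0 and word[start:end] == block:
                ways[end] += ways[start]
    return ways[-1]

print(count_parsings("aaaa", ["a", "aa"]))  # 5
```

For R = (a ∪ aa)* the counts follow the Fibonacci recurrence, which is why the word aaaa has five parsings.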


A-regularity implies S-regularity. This construction is due to Kleene [294] 
whose interest had its origin in the formal expressive power of nerve nets. Within 
the classical framework of the theory of regular languages, it produces from an au- 
tomaton (possibly nondeterministic) a regular expression (possibly ambiguous). 

For our purposes, let a deterministic automaton $\mathfrak{a}$ be given, with alphabet A, set
of states $Q$, with $q_0$ and $\overline{Q}$ the initial state and the set of final states respectively.
The idea consists in constructing inductively the family of languages $\mathcal{L}^{(r)}_{i,j}$ of words
that connect state $q_i$ to state $q_j$ passing only through states $q_0, \ldots, q_r$ in between $q_i$
and $q_j$. We initialize the data with $\mathcal{L}^{(-1)}_{i,j}$ to be the singleton set {a} if the transition
$(q_i \circ a) = q_j$ exists, and the empty set (∅) otherwise. The fundamental recursion

$$\mathcal{L}^{(r)}_{i,j} = \mathcal{L}^{(r-1)}_{i,j} + \mathcal{L}^{(r-1)}_{i,r}\,\operatorname{SEQ}\{\mathcal{L}^{(r-1)}_{r,r}\}\,\mathcal{L}^{(r-1)}_{r,j}$$

incrementally takes into account the possibility of traversing the “new” state $q_r$. (The
unions are clearly disjoint and the segmentation of words according to passages
through state $q_r$ is unambiguously defined, hence the validity of the sequence
construction.) The language $\mathcal{L}$ accepted by $\mathfrak{a}$ is then given by the regular specification

$$\mathcal{L} = \sum_{q_j \in \overline{Q}} \mathcal{L}^{(\|Q\|)}_{0,j}$$

that describes the set of all words leading from the initial state $q_0$ to any of the final
states while passing freely through any intermediate state of the automaton.


¹ For instance if R = (a ∪ aa)* and w = aaaa, then κ(w) = 5, corresponding to the five parsings:
a·a·a·a, a·a·aa, a·aa·a, aa·a·a, aa·aa.

[Figure: diagram relating S-regularity (unambiguous RegExp), general RegExp, A-regularity
(deterministic automata), and nondeterministic automata through the arrows K, RS, and I.]


FIGURE I.1. Equivalence between various notions of regularity: K is Kleene’s construc- 
tion; RS is Rabin-Scott’s reduction; I is the inductive construction of the text. 


S-regularity implies A-regularity. An object described by a regular specification
$\mathfrak{r}$ can be first encoded as a word, with separators indicating the way the word should be
parsed unambiguously. These encodings are then describable by a regular expression
using the correspondence of (11). Next, any language described by a regular expression
is recognizable by an automaton (possibly nondeterministic), as shown by an inductive
construction. (We only state the principles informally here.) Let $\to\boxed{\mathfrak{r}}\to$ represent
symbolically the automaton recognizing the regular expression $\mathfrak{r}$, with the initial state
on the left and the final state(s) on the right. Then, the rules are schematically


[Diagrams: the automata for the union, the catenation product, and the star, assembled
from the boxes of their components; drawings not recoverable.]


Finally, a standard result of the theory, the Rabin-Scott theorem, asserts that any non- 
deterministic finite automaton can be emulated by a deterministic one. (Note: this 
general reduction produces a deterministic automaton whose set of states is the pow- 
erset of the set of states of the original automaton; it may consequently involve an 
exponential blow-up in the size of descriptions.) 
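The subset construction behind the Rabin-Scott theorem can be sketched as follows. This is a minimal illustration in Python, not the text's own presentation: the function names, the dictionary encoding of transitions, and the sample automaton are all assumptions, and ε-moves are not handled. Only subsets reachable from the initial state are built, which in practice is often far fewer than the full powerset.

```python
from collections import deque

def determinize(alphabet, delta, start, finals):
    """Subset construction for an NFA without epsilon-moves.

    delta maps (state, letter) to a set of successor states.
    Returns (subsets, dfa_delta, start_subset, final_subsets).
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        for letter in alphabet:
            target = frozenset(
                t for q in subset for t in delta.get((q, letter), ())
            )
            dfa_delta[(subset, letter)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    final_sets = {s for s in seen if s & set(finals)}
    return seen, dfa_delta, start_set, final_sets

def accepts(dfa_delta, start_set, final_sets, word):
    """Run the deterministic automaton on `word`."""
    state = start_set
    for letter in word:
        state = dfa_delta[(state, letter)]
    return state in final_sets

# Sample NFA (hypothetical): words over {a, b} whose second-to-last letter is a.
nfa_delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
subsets, dfa_delta, s0, final_sets = determinize("ab", nfa_delta, 0, {2})
print(len(subsets), accepts(dfa_delta, s0, final_sets, "bab"))
```

On this three-state example the deterministic automaton has four reachable states; requiring the k-th letter from the end to be a forces 2^k states, which illustrates the exponential blow-up mentioned above.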


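8. Stirling numbers. These numbers count amongst the most famous ones of
combinatorial analysis. They appear in two kinds:


• the Stirling cycle number (also called ‘of the first kind’) $\left[{n \atop k}\right]$ enumerates
permutations of size n having k cycles;

• the Stirling partition number (also called ‘of the second kind’) $\left\{{n \atop k}\right\}$ enumerates
partitions of an n-set into k nonempty equivalence classes.


The notations $\left[{n \atop k}\right]$ and $\left\{{n \atop k}\right\}$ proposed by Knuth (himself anticipated by Karamata) are
nowadays most widespread; see [248].

The most natural way to define Stirling numbers is in terms of the “vertical” EGFs
when the value of k is kept fixed:

$$\sum_{n\ge 0}\left[{n \atop k}\right]\frac{z^n}{n!} = \frac{1}{k!}\left(\log\frac{1}{1-z}\right)^{k}, \qquad \sum_{n\ge 0}\left\{{n \atop k}\right\}\frac{z^n}{n!} = \frac{1}{k!}\left(e^{z}-1\right)^{k}.$$

From there, the bivariate EGFs follow straightforwardly:

$$\sum_{n,k\ge 0}\left[{n \atop k}\right]u^{k}\frac{z^n}{n!} = \exp\left(u\log\frac{1}{1-z}\right) = (1-z)^{-u}, \qquad \sum_{n,k\ge 0}\left\{{n \atop k}\right\}u^{k}\frac{z^n}{n!} = \exp\left(u\,(e^{z}-1)\right).$$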


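Stirling numbers and their cognates satisfy a host of algebraic relations. For
instance, the differential relations of the EGFs imply recurrences reminiscent of the
binomial recurrence

$$\left[{n \atop k}\right] = \left[{n-1 \atop k-1}\right] + (n-1)\left[{n-1 \atop k}\right], \qquad \left\{{n \atop k}\right\} = \left\{{n-1 \atop k-1}\right\} + k\left\{{n-1 \atop k}\right\}.$$

By expanding the powers in the vertical EGF of the Stirling partition numbers or by
techniques akin to Lagrange inversion, one finds explicit forms

$$\left[{n \atop k}\right] = \sum_{0\le j\le h\le n-k} (-1)^{n-k+j+h}\binom{h}{j}\binom{n-1+h}{n-k+h}\binom{2n-k}{n-k-h}\frac{(h-j)^{n-k+h}}{h!},$$

$$\left\{{n \atop k}\right\} = \frac{1}{k!}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j}j^{n}.$$

Though comforting, these forms are not too useful in general. (The one relative to
Stirling cycle numbers was obtained by Schlömilch in 1852 [98, p. 216].)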
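As a sanity check, both kinds of Stirling numbers can be tabulated from the standard triangular recurrences and compared with closed forms. The Python sketch below is illustrative only; the function names are hypothetical, and the Schlömilch-type sum is written with an explicit sign factor $(-1)^{n-k}$ so that it returns the unsigned cycle numbers.

```python
from fractions import Fraction
from math import comb, factorial

def stirling_tables(N):
    """Tables cycle[n][k], part[n][k] for 0 <= k <= n <= N from the recurrences
    [n,k] = [n-1,k-1] + (n-1)[n-1,k]  and  {n,k} = {n-1,k-1} + k{n-1,k}."""
    cycle = [[0] * (N + 1) for _ in range(N + 1)]
    part = [[0] * (N + 1) for _ in range(N + 1)]
    cycle[0][0] = part[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            cycle[n][k] = cycle[n - 1][k - 1] + (n - 1) * cycle[n - 1][k]
            part[n][k] = part[n - 1][k - 1] + k * part[n - 1][k]
    return cycle, part

def partition_explicit(n, k):
    """{n,k} = (1/k!) * sum_j C(k,j) (-1)^(k-j) j^n."""
    total = sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))
    return total // factorial(k)

def cycle_schlomilch(n, k):
    """Unsigned cycle number via a Schlomilch-type double sum (n >= 1)."""
    d = n - k
    total = Fraction(0)
    for h in range(d + 1):
        for j in range(h + 1):
            total += Fraction(
                (-1) ** (j + h) * comb(h, j) * comb(n - 1 + h, d + h)
                * comb(2 * n - k, d - h) * (h - j) ** (d + h),
                factorial(h),
            )
    return (-1) ** d * total

cycle, part = stirling_tables(7)
print(cycle[4], part[4])
```

The row sums of the two tables are n! and the Bell numbers respectively, which gives a further quick consistency check.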

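A more important relation is that of the generating polynomials of the $\left[{n \atop r}\right]$ for
fixed n,

$$P_n(u) \equiv \sum_{r=0}^{n}\left[{n \atop r}\right]u^{r} = u\,(u+1)(u+2)\cdots(u+n-1). \tag{12}$$

This nicely parallels the OGF for the $\left\{{n \atop r}\right\}$ for fixed r:

$$\sum_{n\ge 0}\left\{{n \atop r}\right\}z^{n} = \frac{z^{r}}{(1-z)(1-2z)\cdots(1-rz)}.$$


▷ I.13. Schlömilch’s formula. It is established starting from

$$\frac{k!}{n!}\left[{n \atop k}\right] = \frac{1}{2i\pi}\oint \log^{k}\frac{1}{1-z}\,\frac{dz}{z^{n+1}},$$

via the change of variable à la Lagrange: $z = 1 - e^{-t}$. See [98, p. 216] and [202]. ◁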


9. Tree concepts. In the abstract graph-theoretic sense, a forest is an acyclic (undi- 
rected) graph and a tree is a forest that consists of just one connected component. A 
rooted tree is a tree in which a specific node is distinguished, the root. Rooted trees 
are drawn with the root either below (the mathematician’s and botanist’s convention)
or on top (the genealogist’s and computer scientist’s convention), and in this book, we
employ both conventions indifferently. Here are then two planar representations of the
same rooted tree


(13) [Figure: two drawings of the same rooted tree with nodes tagged a, b, c, ...; drawings not recoverable.]


where the star distinguishes the root. (Tags on nodes, a, b, c, etc., are not part of the tree
structure but only meant to discriminate nodes here.) A tree whose nodes are labelled
by distinct integers then becomes a labelled tree, this in the precise technical sense of
Chapter II. Size is defined by the number of nodes (vertices). Here is for instance a
labelled tree of size 9:


(14) [Figure: a labelled tree on the nodes 1, ..., 9; drawing not recoverable.]


In a rooted tree, the outdegree of a node is the number of its immediate descendants; with the
sole exception of the root, outdegree is thus equal to degree (in the graph-theoretic
sense, i.e., the number of neighbours) minus 1. Once this convention is clear, one
usually abbreviates “outdegree” by “degree” when speaking of rooted trees. A leaf is
a node without descendant, that is, a node of (out)degree equal to 0. For instance the
tree in (14) has 5 leaves. Non-leaf nodes are also called internal nodes.

Many applications from genealogy to computer science require superimposing 
an additional structure on a graph-theoretic tree. A plane tree (sometimes also called 
a planar tree) is defined as a tree in which subtrees dangling from a common node 
are ordered between themselves and represented from left to right in order. Thus, the 
two representations in (13) are equivalent as graph-theoretic trees, but they become 
distinct objects when regarded as plane trees. 

Binary trees play a special role in combinatorics. These are rooted trees in which
every nonleaf node has degree 2 exactly as, for instance, in the first two drawings
below:

[Figure: three drawings (a binary tree, the same tree with its leaves distinguished, and the
corresponding pruned binary tree); drawings not recoverable.]


In the second case, the leaves have been distinguished by ‘O’. The pruned binary tree 
(third representation) is obtained from a regular binary tree by removing the leaves. A 
binary tree can be fully reconstructed from its pruned version, and a tree of size 2n + 1 
always expands a pruned tree of size n. 

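A few major classes are encountered throughout this book. Here is a summary².


General plane trees (Catalan trees)    $\mathcal{G} = \mathcal{Z} \times \operatorname{SEQ}\{\mathcal{G}\}$    (unlabelled)
Binary trees    $\mathcal{A} = \mathcal{Z} + (\mathcal{Z} \times \mathcal{A} \times \mathcal{A})$    (unlabelled)
Nonempty pruned binary trees    $\mathcal{B} = \mathcal{Z} + 2(\mathcal{Z} \times \mathcal{B}) + (\mathcal{Z} \times \mathcal{B} \times \mathcal{B})$    (unlabelled)
Pruned binary trees    $\mathcal{C} = 1 + (\mathcal{Z} \times \mathcal{C} \times \mathcal{C})$    (unlabelled)
General nonplane trees (Cayley trees)    $\mathcal{T} = \mathcal{Z} \star \operatorname{SET}\{\mathcal{T}\}$    (labelled)


The corresponding GFs are respectively

$$G(z) = \frac{1-\sqrt{1-4z}}{2}, \qquad A(z) = \frac{1-\sqrt{1-4z^{2}}}{2z}, \qquad B(z) = \frac{1-2z-\sqrt{1-4z}}{2z},$$

$$C(z) = \frac{1-\sqrt{1-4z}}{2z}, \qquad T(z) = z\,e^{T(z)},$$

being respectively of type OGF for the first four and EGF for the last one. The
corresponding counts are

$$G_n = \frac{1}{n}\binom{2n-2}{n-1}, \qquad A_{2n+1} = \frac{1}{n+1}\binom{2n}{n}, \qquad B_n = \frac{1}{n+1}\binom{2n}{n}\ (n \ge 1),$$

$$C_n = \frac{1}{n+1}\binom{2n}{n}, \qquad T_n = n^{n-1}.$$

The common occurrence of the Catalan numbers ($C_n = B_n = A_{2n+1} = G_{n+1}$) is
explained by pruning and by the rotation correspondence described on p. 69.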
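The coincidences among these counts can be confirmed by expanding the defining equations as truncated power series. The sketch below is a minimal illustration (the helper names and the truncation order `N` are arbitrary choices); it iterates each functional equation from the zero series, each pass fixing at least one more coefficient. The equation G = z + G² used for general plane trees is an algebraic rewriting of G = z/(1 − G).

```python
from math import comb

N = 12  # truncation order (arbitrary)

def mul(a, b):
    """Product of two series truncated at z^N (coefficient lists of length N+1)."""
    c = [0] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def shift(a):
    """Multiply a truncated series by z."""
    return [0] + a[:-1]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def fixed_point(rhs):
    """Iterate y <- rhs(y) from y = 0 until all coefficients up to z^N are stable."""
    y = [0] * (N + 1)
    for _ in range(N + 1):
        y = rhs(y)
    return y

ONE = [1] + [0] * N
Z = shift(ONE)

G = fixed_point(lambda y: add(Z, mul(y, y)))            # G = z + G^2
A = fixed_point(lambda y: add(Z, shift(mul(y, y))))     # A = z + z A^2
C = fixed_point(lambda y: add(ONE, shift(mul(y, y))))   # C = 1 + z C^2

def catalan(n):
    return comb(2 * n, n) // (n + 1)

print(C[:8], G[:8], A[:8])
```

The printed lists show the Catalan numbers appearing in C, shifted by one index in G, and at odd indices only in A.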


² The term “general” refers to the fact that no degree constraint is imposed.


APPENDIX B 


Basic Complex Analysis 


This appendix contains entries arranged in alphabetical order regarding the following topics: 
Algebraic elimination; Equivalent definitions of analyticity; Gamma function; Holo- 
nomic functions; Implicit Function Theorem; Laplace’s method; Mellin transform; 
Several complex variables. 

The corresponding notions and results are used in particular starting with Part B, which is
relative to Complex Asymptotics.


1. Algebraic elimination. Auxiliary quantities can be eliminated from systems of
polynomial equations. In essence, elimination is achieved by suitable combinations of
the equations themselves. One of the best strategies is based on Gröbner bases and is
presented in the excellent book of Cox, Little, and O’Shea [104]. This entry develops
a more elementary approach based on resultants.


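Resultants. Consider a field of coefficients $\mathbb{K}$ which may be specialized as
$\mathbb{Q}, \mathbb{C}, \mathbb{C}(z), \ldots$, as the need arises. A polynomial of degree d in $\mathbb{K}[x]$ has at most
d roots in $\mathbb{K}$ and exactly d roots in the algebraic closure $\overline{\mathbb{K}}$ of $\mathbb{K}$. Given two
polynomials,

$$P(x) = \sum_{j=0}^{\ell} a_j x^{\ell-j}, \qquad Q(x) = \sum_{k=0}^{m} b_k x^{m-k},$$

their resultant (with respect to the variable x) is the determinant of order (ℓ + m),

$$R(P, Q, x) = \det\begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & 0 & 0\\
0 & a_0 & a_1 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & a_{\ell-1} & a_{\ell}\\
b_0 & b_1 & b_2 & \cdots & 0 & 0\\
0 & b_0 & b_1 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & b_{m-1} & b_{m}
\end{pmatrix}, \tag{1}$$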

also called the Sylvester determinant. By its definition, the resultant is a polynomial
form in the coefficients of P and Q. The main property of resultants is the following:
(i) If $P(x), Q(x) \in \mathbb{K}[x]$ have a common root in the algebraic closure $\overline{\mathbb{K}}$ of $\mathbb{K}$,
then $R(P(x), Q(x), x) = 0$. (ii) Conversely, if $R(P(x), Q(x), x) = 0$ holds, then
either $a_0 = b_0 = 0$ or else P(x), Q(x) have a common root in $\overline{\mathbb{K}}$. [The idea of the
proof of (i) is as follows. Let S be the matrix in (1). Then the homogeneous linear
system $Sw = 0$ admits a solution $w = (\xi^{\ell+m-1}, \ldots, \xi^{2}, \xi, 1)$ where ξ is a common
root of P and Q; this is only possible if det(S) = R vanishes.] See especially van
der Waerden’s crisp treatment in [480] and Lang’s treatise [327, V.10] for a detailed
presentation of resultants.

Equating the resultant to 0 thus provides a necessary condition for the existence
of common roots, but not always a sufficient one. This has implications in situations
where the coefficients $a_j, b_k$ depend on one or several parameters. In that case, the
condition R(P, Q, x) = 0 will certainly capture all the situations where P and Q have
a common root, but it may also include some situations where there is a reduction in
degree, although the polynomials have no common root. For instance, take $P(x) = tx - 2$
and $Q(x) = tx^{2} - 4$ (with t a parameter); the resultant with respect to x is
found to be

$$R = 4t(1 - t).$$

Indeed, the condition R = 0 corresponds to either a common root (t = 1 for which
P(2) = Q(2) = 0) or to some degeneracy in degree (t = 0 for which P(x) = −2 and
Q(x) = −4 have no common zero).
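The example can be replayed numerically by building the Sylvester matrix and taking its determinant. The following Python sketch is one possible way to do so (the function names are illustrative; exact rational arithmetic with plain Gaussian elimination is an implementation choice, not the text's method). Polynomials are given as coefficient lists with the leading coefficient first, matching the indexing $a_0, \ldots, a_\ell$ above.

```python
from fractions import Fraction

def sylvester(p, q):
    """Sylvester matrix of p = [a_0..a_l] and q = [b_0..b_m] (leading coefficient first)."""
    l, m = len(p) - 1, len(q) - 1
    size = l + m
    rows = []
    for i in range(m):  # m shifted copies of the coefficients of p
        rows.append([0] * i + list(p) + [0] * (size - l - 1 - i))
    for i in range(l):  # l shifted copies of the coefficients of q
        rows.append([0] * i + list(q) + [0] * (size - m - 1 - i))
    return rows

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def resultant(p, q):
    return det(sylvester(p, q))

# P(x) = t x - 2 and Q(x) = t x^2 - 4 for a few sample values of t.
for t in (Fraction(-2), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3)):
    print(t, resultant([t, -2], [t, 0, -4]), 4 * t * (1 - t))
```

At t = 0 the declared leading coefficients both vanish, which is exactly the degenerate case where the resultant is zero without any common root.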


Systems of equations. Given a system

$$\{P_j(z, y_1, y_2, \ldots, y_m) = 0\}, \qquad j = 1\,..\,m, \tag{2}$$

defining an algebraic curve, we can then proceed as follows in order to extract a single
equation satisfied by one of the indeterminates. By taking resultants with $P_m$,
eliminate all occurrences of the variable $y_m$ from the first m − 1 equations, thereby
obtaining a new system of m − 1 equations in m − 1 variables (with z kept as a
parameter, so that the base field is $\mathbb{C}(z)$). Repeat the process and successively eliminate
$y_{m-1}, \ldots, y_2$. The strategy (in the simpler case where variables are eliminated in
succession exactly one at a time) is summarized in the skeleton procedure Eliminate:


procedure Eliminate (P_1, ..., P_m, y_1, y_2, ..., y_m);
  {Elimination of y_2, ..., y_m by resultants}
  (A_1, ..., A_m) := (P_1, ..., P_m);
  for j from m by −1 to 2 do
    for k from j − 1 by −1 to 1 do
      A_k := R(A_k, A_j, y_j);
  return(A_1).

The polynomials obtained need not be minimal, in which case one should appeal
to multivariate polynomial factorization in order to select the relevant factors at each
stage. (Gröbner bases provide a neater alternative to these questions, see [104].)

Computer algebra systems usually provide implementations of both resultants and
Gröbner bases. The complexity of elimination is however exponential in the worst
case: degrees essentially multiply, which is somewhat intrinsic as $y_0$ in the quadratic
system of k equations

$$y_0 - z - y_{k-1}^{2} = 0, \quad y_{k-1} - y_{k-2}^{2} = 0, \quad \ldots, \quad y_1 - y_0^{2} = 0$$

(determining the OGF of regular trees of degree $2^{k}$) represents an algebraic function
of degree $2^{k}$ and no less.


▷ II.1. Resultant and roots. Let $P, Q \in \mathbb{C}[x]$ have sets of roots $\{\alpha_j\}$ and $\{\beta_k\}$ respectively.
Then

$$R(P, Q, x) = a_0^{m} b_0^{\ell}\prod_{i=1}^{\ell}\prod_{j=1}^{m}(\alpha_i - \beta_j) = a_0^{m}\prod_{i=1}^{\ell} Q(\alpha_i).$$

The discriminant of P classically defined by $D(P) := a_0^{-1} R(P(x), P'(x), x)$ satisfies

$$D(P) \equiv a_0^{-1} R(P(x), P'(x), x) = a_0^{2\ell-2}\prod_{i \ne j}(\alpha_i - \alpha_j).$$

Given the coefficients of P and the value of D(P), there results an effectively computable
bound on the minimal separation distance δ between any two roots of P. [Hint. Let
$A = 1 + \max_j(|a_j/a_0|)$. Then each $\alpha_j$ satisfies $|\alpha_j| < A$. Set $L = \binom{\ell}{2}$. Then
$\delta \ge |a_0|^{2-2\ell}\,|D(P)|\,(2A)^{1-2L}$.] ◁


2. Equivalent definitions of analyticity. Two parallel notions are introduced at the 
beginning of Chapter IV: analyticity (defined by power series expansions) and holo- 
morphy (defined as complex differentiability). As is known from any textbook on 
complex analysis, these notions are equivalent. Given their importance for analytic 
combinatorics, this appendix entry sketches a proof of the equivalence, which is sum- 
marized by the following diagram: 


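$$\text{Analyticity} \;\overset{[A]}{\Longrightarrow}\; \mathbb{C}\text{-differentiability} \;\overset{[B]}{\Longrightarrow}\; \text{Null Integral Property} \;\overset{[C]}{\Longrightarrow}\; \text{Analyticity}.$$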


A. Analyticity implies complex-differentiability. Let f(z) be analytic in the disc
$D(z_0; R)$. We may assume without loss of generality that $z_0 = 0$ and R = 1 (else
effect a linear transformation on the argument z). According to the definition of
analyticity, the series representation

$$f(z) = \sum_{n=0}^{\infty} f_n z^{n} \tag{3}$$

converges for all z with |z| < 1. Elementary series rearrangements first entail that
f(z) given by this representation is analytic at any $z_1$ interior to D(0; 1). Similar
techniques then show the existence of the derivative as well as the fact that the
derivative can be obtained by term-wise differentiation of (3).


▷ II.2. Proof of [A]: Analyticity implies differentiability. First, formally, the binomial theorem
provides

$$\begin{aligned}
f(z) = \sum_{n\ge 0} f_n z^{n} &= \sum_{n\ge 0} f_n (z_1 + z - z_1)^{n}\\
&= \sum_{n\ge 0}\sum_{k=0}^{n}\binom{n}{k} f_n z_1^{k}(z - z_1)^{n-k}\\
&= \sum_{m\ge 0} c_m (z - z_1)^{m}, \qquad c_m := \sum_{k\ge 0}\binom{m+k}{k} f_{m+k} z_1^{k}.
\end{aligned} \tag{4}$$

Let $r_1$ be any number smaller than $1 - |z_1|$. We observe that (4) makes analytic sense. Indeed,
one has the bound $|f_n| \le C A^{n}$, valid for any A > 1 and some C > 0. Thus, the terms in (4)
are dominated in absolute value by those of the double series

$$\sum_{n\ge 0}\sum_{k=0}^{n}\binom{n}{k} C A^{n} |z_1|^{k} r_1^{n-k} = C\sum_{n\ge 0} A^{n}(|z_1| + r_1)^{n} = \frac{C}{1 - A(|z_1| + r_1)}, \tag{5}$$

which is absolutely convergent as soon as A is chosen such that $A < (|z_1| + r_1)^{-1}$.

Complex differentiability of f at any $z_1 \in D(0; 1)$ derives from the analogous calculation,
valid for small enough δ,

$$\begin{aligned}
\frac{1}{\delta}\left(f(z_1 + \delta) - f(z_1)\right) &= \sum_{n\ge 0} n f_n z_1^{n-1} + \delta\sum_{n\ge 0} f_n\sum_{k=2}^{n}\binom{n}{k} z_1^{n-k}\delta^{k-2}\\
&= \sum_{n\ge 0} n f_n z_1^{n-1} + O(\delta),
\end{aligned} \tag{6}$$

where boundedness of the coefficient of δ results from an argument analogous to (5). ◁


The argument of Note 2 has shown that the derivative of f at z; is obtained by 
differentiating termwise the series representing f. More generally derivatives of all 
orders exist and can be obtained in a similar fashion. In view of this fact, the equalities 
of (4) can also be interpreted as the Taylor expansion (by grouping terms according to
values of k first):

$$f(z_1 + \delta) = f(z_1) + \delta f'(z_1) + \frac{\delta^{2}}{2!} f''(z_1) + \cdots \tag{7}$$


which is thus generally valid for analytic functions. 


B. Complex differentiability implies the “Null Integral” Property. The Null Integral
Property relative to a domain Ω is the property:

$$\int_{\lambda} f(z)\,dz = 0 \qquad \text{for any loop } \lambda \subset \Omega.$$

(A loop is a closed path that can be contracted to a single point in the domain Ω; cf.
Chapter IV.) Its proof results simply from the Cauchy–Riemann equations and from
Green’s formula.


▷ II.3. Proof of [B]: the Null Integral Property. This starts from the Cauchy–Riemann
equations. Let $P(x, y) = \Re f(x + iy)$ and $Q(x, y) = \Im f(x + iy)$. By adopting successively in the
definition of complex differentiability δ = h and δ = ih, one finds $P'_x + iQ'_x = Q'_y - iP'_y$,
implying

$$\frac{\partial P}{\partial x} = \frac{\partial Q}{\partial y} \qquad\text{and}\qquad \frac{\partial P}{\partial y} = -\frac{\partial Q}{\partial x}, \tag{8}$$

known as the Cauchy–Riemann equations. (The functions P and Q satisfy the partial
differential equations Δf = 0, where Δ is the 2-dimensional Laplacian
$\Delta := \frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial y^{2}}$; such
functions are known as harmonic functions.) The Null Integral Property, given
differentiability, results from the Cauchy–Riemann equations, upon taking into account Green’s theorem of
multivariate calculus,

$$\int_{\partial K} A\,dx + B\,dy = \iint_{K}\left(\frac{\partial B}{\partial x} - \frac{\partial A}{\partial y}\right)dx\,dy,$$

which is valid for any (compact) domain K enclosed by a simple curve ∂K. ◁

C. Complex differentiability implies analyticity. The starting point is the formula

$$f(a) = \frac{1}{2i\pi}\int_{\gamma}\frac{f(z)}{z - a}\,dz, \tag{9}$$

knowing only differentiability of f and its consequence, the Null Integral Property
(but precisely not postulating the existence of an analytic expansion). There γ is a
simple positive loop encircling a inside a region where f is analytic.

▷ II.4. Proof of [C]: the integral representation. The proof of (9) is obtained by decomposing
f(z) in the original integral as f(z) = f(z) − f(a) + f(a). Define accordingly

$$g(z) = \begin{cases} \dfrac{f(z) - f(a)}{z - a} & \text{for } z \ne a,\\[1ex] f'(a) & \text{for } z = a. \end{cases}$$

By the differentiability assumption, g is continuous and holomorphic (differentiable) at any
point other than a. Its integral is thus 0 along γ. On the other hand, we have

$$\int_{\gamma}\frac{dz}{z - a} = 2i\pi,$$

by a simple computation: deform γ to a small circle around a and evaluate the integral directly
by setting $z - a = re^{i\theta}$. ◁


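Once (9) is granted, it suffices to write, e.g., for an expansion at 0,

$$\begin{aligned}
f(z) &= \frac{1}{2i\pi}\int_{\gamma} f(t)\,\frac{dt}{t - z}\\
&= \frac{1}{2i\pi}\int_{\gamma} f(t)\left(1 + \frac{z}{t} + \frac{z^{2}}{t^{2}} + \cdots\right)\frac{dt}{t}\\
&= \sum_{n\ge 0} f_n z^{n}, \qquad f_n := \frac{1}{2i\pi}\int_{\gamma} f(t)\,\frac{dt}{t^{n+1}}.
\end{aligned}$$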


(Exchanges of integration and summation are justified by normal convergence.) An- 
alyticity is thus proved from complex-differentiability and its consequence the Null 
Integral Property. 
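The coefficient integral $f_n = \frac{1}{2i\pi}\int_\gamma f(t)\,dt/t^{n+1}$ can be evaluated numerically when γ is a circle. The sketch below is an illustration under stated assumptions: the function name, the default radius, and the number of sample points are arbitrary, and the trapezoidal rule is one convenient quadrature among others. With $t = re^{i\theta}$ one has $dt = it\,d\theta$, so the integral reduces to the average of $f(t)\,t^{-n}$ over the circle.

```python
import cmath
import math

def cauchy_coefficient(f, n, radius=1.0, samples=256):
    """Approximate the n-th Taylor coefficient of f at 0 by the trapezoidal rule
    applied to Cauchy's coefficient integral on the circle |t| = radius."""
    total = 0
    for k in range(samples):
        t = radius * cmath.exp(2j * math.pi * k / samples)
        total += f(t) / t ** n
    return total / samples

# Coefficients of exp(z) should be close to 1/n!.
for n in range(6):
    print(n, cauchy_coefficient(cmath.exp, n).real, 1 / math.factorial(n))
```

For functions analytic in a disc larger than the contour, the trapezoidal rule on a circle converges geometrically in the number of sample points, which is why a few hundred points already give near machine precision here.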

▷ II.5. Cauchy’s formula for derivatives. One has

$$f^{(n)}(a) = \frac{n!}{2i\pi}\int_{\gamma}\frac{f(z)}{(z - a)^{n+1}}\,dz.$$

This follows from (9) by differentiation under the integral sign. ◁


▷ II.6. Morera’s Theorem. Suppose that f is continuous [but not a priori known to be
differentiable] in an open set Ω and that its integral along any triangle in Ω is 0. Then, f is analytic
(hence holomorphic) in Ω. [For a proof, see, e.g., [402, p. 68].] ◁


3. Gamma function. The formulae of singularity analysis in Chapter IV involve the
Gamma function in an essential manner. The Gamma function extends to nonintegral
arguments the factorial function and we collect in this appendix a few classical facts 
regarding it. Proofs may be found in classic treatises like Henrici’s [265] or Whittaker 
and Watson’s [492]. 


Basic properties. Euler introduced the Gamma function as

$$\Gamma(s) = \int_{0}^{\infty} e^{-t}\,t^{s-1}\,dt, \tag{10}$$

where the integral converges provided ℜ(s) > 0. Through integration by parts, one
immediately derives the basic functional equation of the Gamma function,

$$\Gamma(s + 1) = s\,\Gamma(s). \tag{11}$$

FIGURE II.1. A plot of Γ(s) for real s.


Since Γ(1) = 1, one has Γ(n + 1) = n!, so that the Gamma function serves to
extend the factorial function for nonintegral arguments. For combinatorial purposes,
the special value,

$$\Gamma\left(\tfrac{1}{2}\right) = \int_{0}^{\infty} e^{-t}\,\frac{dt}{\sqrt{t}} = 2\int_{0}^{\infty} e^{-x^{2}}\,dx = \sqrt{\pi}, \tag{12}$$

proves to be quite important. It implies in turn $\Gamma(-\tfrac{1}{2}) = -2\sqrt{\pi}$.

From (11), the Gamma function can be analytically continued to the whole of $\mathbb{C}$
with the exception of poles at 0, −1, −2, .... The functional equation used backwards
yields

$$\Gamma(s) \sim \frac{(-1)^{m}}{m!}\,\frac{1}{s + m} \qquad (s \to -m),$$

so that the residue of Γ(s) at s = −m is $(-1)^{m}/m!$. Figure 1 depicts the graph of
Γ(s) for real values of s.
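The special values and residues can be observed numerically with the standard library’s `math.gamma`. The sketch below is illustrative only: the function name `residue_estimate` and the offset `eps` are arbitrary choices, and the residue is merely approximated by evaluating $(s + m)\,\Gamma(s)$ slightly to the right of the pole.

```python
import math

def residue_estimate(m, eps=1e-7):
    """Approximate the residue of Gamma at s = -m by (s + m) * Gamma(s) at s = -m + eps."""
    return eps * math.gamma(-m + eps)

print(math.gamma(0.5), math.sqrt(math.pi))        # Gamma(1/2) versus sqrt(pi)
print(math.gamma(-0.5), -2 * math.sqrt(math.pi))  # Gamma(-1/2) versus -2 sqrt(pi)
for m in range(5):
    print(m, residue_estimate(m), (-1) ** m / math.factorial(m))
```

The estimate carries a relative error of order `eps`, since the next term in the local expansion of Γ at the pole is a constant.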


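▷ II.7. Evaluation of the Gaussian integral. Define $J := \int_{0}^{\infty} e^{-x^{2}}\,dx$. The idea is to
evaluate $J^{2}$:

$$J^{2} = \int_{0}^{\infty}\!\!\int_{0}^{\infty} e^{-(x^{2} + y^{2})}\,dx\,dy.$$

Going to polar coordinates, $(x^{2} + y^{2})^{1/2} = \rho$, $x = \rho\cos\theta$, $y = \rho\sin\theta$ yields, via the standard
change of variables formula:

$$J^{2} = \int_{0}^{\infty}\!\!\int_{0}^{\pi/2} e^{-\rho^{2}}\,\rho\,d\rho\,d\theta.$$

The equality $J^{2} = \pi/4$ results. ◁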


Hankel contour representation. Euler’s integral representation of Γ(s) used in
conjunction with the functional equation permits us to continue Γ(s) to the whole of
the complex plane. A direct approach due to Hankel provides an alternative integral
representation valid for all values of s.

Theorem B.1 (Hankel’s contour integral). Let $\int_{+\infty}^{(0)}$ denote an integral taken along
a contour starting at +∞ in the upper half plane, winding counterclockwise around the
origin, and proceeding towards +∞ in the lower half plane. Then, for all $s \in \mathbb{C}$,

$$\frac{1}{\pi}\sin(\pi s)\,\Gamma(1 - s) = -\frac{1}{2i\pi}\int_{+\infty}^{(0)} (-t)^{-s}\,e^{-t}\,dt. \tag{13}$$

In (13), $(-t)^{-s}$ is assumed to have its principal determination when t is negative real,
and this determination is then extended uniquely by continuity throughout the contour.


The integral then closely resembles the definition of Γ(1 − s). The first form of (13)
can also be rewritten as 1/Γ(s) by virtue of the complement formula given below.


▷ II.8. Proof of Hankel’s representation. We refer to volume 2 of Henrici’s book [265, p. 35]
or Whittaker and Watson’s treatise [492, p. 245] for a detailed proof.

A contour of integration that fulfills the conditions of the theorem is typically the contour
$\mathcal{H}$ that is at distance 1 of the positive real axis comprising three parts: a line parallel to the
positive real axis in the upper half-plane; a connecting semi-circle centered at the origin; a line
parallel to the positive real axis in the lower half-plane. More precisely, $\mathcal{H} = \mathcal{H}^{-} \cup \mathcal{H}^{+} \cup \mathcal{H}^{\circ}$,
where

$$\begin{aligned}
\mathcal{H}^{-} &= \{z = w - i,\ w \ge 0\}\\
\mathcal{H}^{+} &= \{z = w + i,\ w \ge 0\}\\
\mathcal{H}^{\circ} &= \{z = -e^{i\phi},\ \phi \in [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]\}.
\end{aligned} \tag{14}$$

Let ε be a small positive real number, and denote by $\epsilon\cdot\mathcal{H}$ the image of $\mathcal{H}$ by the
transformation $z \mapsto \epsilon z$. By analyticity, for the integral representation, we can equally well adopt as
integration path the contour $\epsilon\cdot\mathcal{H}$, for any ε > 0. The main idea is then to let ε tend to 0.

Assume momentarily that s < 0. (The extension to arbitrary s then follows by analytic
continuation.) The integral along $\epsilon\cdot\mathcal{H}$ decomposes into three parts:

• The integral along the semi-circle is 0 if we take the circle of a vanishing small
radius, since −s > 0.

• The contributions from the upper and lower lines give, as ε → 0,

$$\int_{+\infty}^{(0)} (-t)^{-s}\,e^{-t}\,dt = (-U + L)\int_{0}^{\infty} t^{-s}\,e^{-t}\,dt,$$

where U and L denote the determinations of $(-1)^{-s}$ on the half-lines lying in the
upper and lower half planes respectively.

By continuity of determinations, $U = (e^{-i\pi})^{-s}$ and $L = (e^{+i\pi})^{-s}$. Therefore, the right hand
side of (13) is equal to

$$-\frac{(-e^{i\pi s} + e^{-i\pi s})}{2i\pi}\,\Gamma(1 - s) = \frac{\sin(\pi s)}{\pi}\,\Gamma(1 - s),$$

which completes the proof of the theorem. ◁


Expansions. The Gamma function has poles at the nonpositive integers but has
no zeros. Accordingly, 1/Γ(s) is an entire function with zeros at 0, −1, ..., and the
position of the zeros is reflected by the product decomposition,

$$\frac{1}{\Gamma(s)} = s\,e^{\gamma s}\prod_{m=1}^{\infty}\left[\left(1 + \frac{s}{m}\right)e^{-s/m}\right] \tag{15}$$

(of the so-called Weierstraß type). There γ ≐ 0.57721 denotes Euler’s constant

$$\gamma = \lim_{n\to\infty}(H_n - \log n) \equiv \sum_{n=1}^{\infty}\left[\frac{1}{n} - \log\left(1 + \frac{1}{n}\right)\right].$$

The logarithmic derivative of the Gamma function is classically known as the psi
function and is denoted by ψ(s):

$$\psi(s) := \frac{d}{ds}\log\Gamma(s) = \frac{\Gamma'(s)}{\Gamma(s)}.$$

In accordance with (15), ψ(s) admits a partial fraction decomposition

$$\psi(s + 1) = -\gamma - \sum_{n=1}^{\infty}\left[\frac{1}{n + s} - \frac{1}{n}\right]. \tag{16}$$

From (16), there results that the Taylor expansion of ψ(s + 1), hence of Γ(s + 1),
involves values of the Riemann zeta function,

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^{s}},$$

at the positive integers: for |s| < 1,

$$\psi(s + 1) = -\gamma + \sum_{n=2}^{\infty}(-1)^{n}\zeta(n)\,s^{n-1},$$

so that the coefficients in the expansion of Γ(s) around any integer are polynomially
expressible in terms of Euler’s constant γ and values of the zeta function at the
integers. For instance, as s → 0,

$$\Gamma(s + 1) = 1 - \gamma s + \left(\frac{\pi^{2}}{12} + \frac{\gamma^{2}}{2}\right)s^{2} + \left(-\frac{\zeta(3)}{3} - \frac{\pi^{2}\gamma}{12} - \frac{\gamma^{3}}{6}\right)s^{3} + O(s^{4}).$$


Another direct consequence of the infinite product formulae for Γ(s) and sin πs is
the complement formula for the Gamma function,

$$\Gamma(s)\,\Gamma(-s) = -\frac{\pi}{s\,\sin \pi s}, \tag{17}$$

which directly results from the factorization of the sine function (due to Euler),

$$\sin s = s\prod_{n=1}^{\infty}\left(1 - \frac{s^{2}}{n^{2}\pi^{2}}\right).$$

In particular, Equation (17) gives back the special value (cf. (12)): $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$.
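▷ II.9. The duplication formula. This is

$$2^{2s-1}\,\Gamma(s)\,\Gamma\left(s + \tfrac{1}{2}\right) = \pi^{1/2}\,\Gamma(2s),$$

which provides the expansion of Γ near 1/2:

$$\Gamma\left(s + \tfrac{1}{2}\right) = \pi^{1/2} - (\gamma + 2\log 2)\,\pi^{1/2}\,s + \left(\frac{\pi^{5/2}}{4} + \frac{(\gamma + 2\log 2)^{2}\,\pi^{1/2}}{2}\right)s^{2} + O(s^{3}).$$

The coefficients now involve log 2 as well as zeta values. ◁
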
Finally, a famous and absolutely fundamental asymptotic formula is Stirling’s
approximation, familiarly known as “Stirling’s formula”:

$$\Gamma(s + 1) = s\,\Gamma(s) \sim s^{s}e^{-s}\sqrt{2\pi s}\left[1 + \frac{1}{12s} + \frac{1}{288s^{2}} - \frac{139}{51840s^{3}} + \cdots\right].$$

It is valid for (large) real s > 0, and more generally for all s → ∞ in |Arg(s)| < π − δ
(any δ > 0). For the purpose of obtaining effective bounds, the following quantitative
relation [492, p. 253] often proves useful,

$$\Gamma(s + 1) = s^{s}e^{-s}(2\pi s)^{1/2}\,e^{\theta/(12s)}, \qquad\text{where } 0 < \theta \equiv \theta(s) < 1,$$

an equality that holds now for all s ≥ 1. Stirling’s formula is usually proved by
appealing to the method of Laplace applied to the integral representation for Γ(s + 1),
see APPENDIX B: Laplace’s method, p. 700, or by Euler–Maclaurin summation
(Note 10). It is derived by different means in APPENDIX B: Mellin transform, p. 707.
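Both the asymptotic series and the effective bound can be probed numerically. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, only the four series terms quoted above are used, and the quantity θ(s) is recovered from `math.lgamma` by solving the quantitative relation for θ.

```python
import math

def stirling_series(s, terms=4):
    """Stirling approximation of Gamma(s + 1) using up to four terms of the series."""
    coeffs = [1.0, 1 / 12, 1 / 288, -139 / 51840]
    correction = sum(c / s ** k for k, c in enumerate(coeffs[:terms]))
    return s ** s * math.exp(-s) * math.sqrt(2 * math.pi * s) * correction

def theta(s):
    """Solve Gamma(s+1) = s^s e^{-s} sqrt(2 pi s) e^{theta/(12 s)} for theta."""
    main = s * math.log(s) - s + 0.5 * math.log(2 * math.pi * s)
    return 12 * s * (math.lgamma(s + 1) - main)

for s in (1, 2, 5, 10, 50):
    exact = math.gamma(s + 1)
    print(s, stirling_series(s) / exact, theta(s))
```

The ratios tend to 1 quickly as s grows, while the computed values of θ(s) stay strictly between 0 and 1 and approach 1.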
▷ II.10. Stirling’s formula via Euler–Maclaurin summation. Stirling’s formula can be derived
from Euler–Maclaurin summation applied to log Γ(s). [See: [248, Sec. 9.6].] ◁


▷ II.11. The Eulerian Beta function. It is defined for ℜ(p), ℜ(q) > 0 by any of the following
integrals,

$$B(p, q) := \int_{0}^{1} x^{p-1}(1 - x)^{q-1}\,dx = \int_{0}^{\infty}\frac{y^{p-1}}{(1 + y)^{p+q}}\,dy = 2\int_{0}^{\pi/2}\cos^{2p-1}\theta\,\sin^{2q-1}\theta\,d\theta,$$

where the last form is known as a Wallis integral. It satisfies:

$$B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p + q)}.$$

[See [492, p. 254] for a proof generalizing that of Note 7.] ◁


4. Holonomic functions. Doron Zeilberger [513] has introduced discrete mathemati- 
cians to a powerful framework, the holonomic framework, which takes its roots in 
classical differential algebra [58, 102] and has found innumerable applications in the 
theory of special functions and symbolic computation [393], combinatorial identi- 
ties, and combinatorial enumeration. In these pages, we can only offer a (too) brief 
orientation tour of this wonderful theory. Major contributions in the perspective of 
Analytic Combinatorics are due to Stanley [446], Zeilberger [513], Gessel [234], and 
Lipshitz [334, 335]. As we shall see there is a chain of growing generality and power, 


rational — algebraic — holonomic. 


The associated asymptotic problems are examined in Subsection VII.9.1, p. 493.

Univariate holonomic functions. Holonomic functions¹ are solutions of linear
differential equations or systems whose coefficients are rational functions. The
univariate theory is elementary.

Definition B.1. A formal power series (or function) f(z) is said to be holonomic if it
satisfies a linear differential equation,

$$c_0(z)\,\frac{d^{r}}{dz^{r}}f(z) + c_1(z)\,\frac{d^{r-1}}{dz^{r-1}}f(z) + \cdots + c_r(z)\,f(z) = 0, \tag{18}$$

where the coefficients $c_j(z)$ lie in the field $\mathbb{C}(z)$ of rational functions. Equivalently,
f(z) is holonomic if the vector space over $\mathbb{C}(z)$ spanned by the set of all its derivatives
$\{\partial^{j} f(z)\}_{j=0}^{\infty}$ is finite dimensional.


¹ A synonymous name is ∂-finite or D-finite.

By clearing denominators, we can assume, if needed, the quantities $c_j(z)$ in (18)
to be polynomials. It then follows that the coefficient sequence $(f_n)$ of a
holonomic f(z) satisfies a recurrence,

$$\widehat{c}_s(n)\,f_{n+s} + \widehat{c}_{s-1}(n)\,f_{n+s-1} + \cdots + \widehat{c}_0(n)\,f_n = 0, \tag{19}$$

for some polynomials $\widehat{c}_j(n)$, provided $n \ge n_0$ (some $n_0$). Such a recurrence (19) is
known as a P-recurrence. (The two properties of sequences, to be the coefficients of
a holonomic function and to be P-recursive, are equivalent.)
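As a small worked instance of the passage from a differential equation to a P-recurrence, consider $f(z) = (1 - 4z)^{-1/2}$, which satisfies $(1 - 4z)f'(z) - 2f(z) = 0$. Extracting the coefficient of $z^{n}$ gives $(n + 1)f_{n+1} = (4n + 2)f_n$. The Python sketch below (the function name is illustrative; the example itself is chosen here and is not taken from the text) unrolls this recurrence and compares it with the central binomial coefficients.

```python
from math import comb

def unroll_central_binomial(n_terms):
    """Unroll the P-recurrence (n + 1) f_{n+1} = (4 n + 2) f_n with f_0 = 1."""
    f = [1]
    for n in range(n_terms - 1):
        f.append((4 * n + 2) * f[n] // (n + 1))  # the division is exact
    return f

values = unroll_central_binomial(10)
print(values)
print([comb(2 * n, n) for n in range(10)])
```

A first-order recurrence with polynomial coefficients such as this one is the simplest kind of P-recurrence; it corresponds to a hypergeometric sequence.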

Functions like $e^{z}$, $\log z$, $\cos(z)$, $\arcsin(z)$, $\sqrt{1 + z}$, and $\operatorname{Li}_2(z) := \sum_{n\ge 1} z^{n}/n^{2}$
are holonomic. Formal power series like $\sum z^{n}/(n!)^{2}$ and $\sum n!\,z^{n}$ are holonomic.
Sequences like $\frac{1}{n+1}\binom{2n}{n}$, $2^{n}/(n^{2} + 1)$ are coefficients of holonomic functions and are
P-recursive. However, sequences like $\sqrt{n}$, $\log n$ are not P-recursive, a fact that can be
proved by an examination of singularities² of associated generating functions [182].
For similar reasons, tan z and sec z that have infinitely many singularities are not
holonomic.

Holonomic functions enjoy a rich set of closure properties. Define the Hadamard
product of two functions h = f ⊙ g to be the termwise product of series:
$[z^{n}]h(z) = ([z^{n}]f(z)) \cdot ([z^{n}]g(z))$. We have:


Theorem B.2 (Univariate holonomic closure). The class of univariate holonomic
functions is closed under the following operations: sum (+), product (×), Hadamard
product (⊙), differentiation ($\partial_z$), indefinite integration ($\int^{z}$), and algebraic
substitution ($z \mapsto y(z)$ for some algebraic function y(z)).


PROOF. An exercise in vector space manipulations. For instance, let $\mathrm{VS}(\partial^{*} f)$
be the vector space over $\mathbb{C}(z)$ spanned by the $\{\partial^{j} f\}$. If h = f + g (or h = f · g),
then $\mathrm{VS}(\partial^{*} h)$ is finite dimensional since it is included in the direct sum
$\mathrm{VS}(\partial^{*} f) \oplus \mathrm{VS}(\partial^{*} g)$ (respectively the tensor product $\mathrm{VS}(\partial^{*} f) \otimes \mathrm{VS}(\partial^{*} g)$). For
Hadamard products, if $h_n = f_n g_n$, then a system of P-recurrences can be obtained
for the quantities $h_n^{(i,j)} = f_{n+i}\,g_{n+j}$ from the recurrences satisfied by $f_n$, $g_n$, and
then a single P-recurrence can be obtained. Closure under algebraic substitution
results from the methods of Note 12. See Stanley’s historic paper [446] and the book
chapter [449, Ch. 6] for details.


▷ II.12. Algebraic functions are holonomic. Let y(z) satisfy P(z, y(z)) = 0, with P a
polynomial. Any nondegenerate rational fraction Q(z, y(z)) can be expressed as a polynomial
in y(z) with coefficients in $\mathbb{C}(z)$. [Proof: let D be the denominator of Q; the Bezout relation
AP − BD = 1 (in $\mathbb{C}(z)[y]$), obtained by a gcd calculation between polynomials (in y),
expresses 1/D as a polynomial in y.] Then, all derivatives of y live in the space spanned over
$\mathbb{C}(z)$ by $1, y, \ldots, y^{d-1}$, with $d = \deg_y P(z, y)$. (The fact that algebraic functions are
holonomic was known to Abel, and an algorithm has been described in recent times by Comtet [97].)
The closure under algebraic substitutions ($z \mapsto y(z)$) asserted in Theorem B.2 can be
established along similar lines. ◁


Zeilberger observed that holonomic functions with coefficients in $\mathbb{Q}$ can be 
specified by a finite amount of information. Equality in this subclass is then decidable: 


²Singularities of holonomic functions, and more generally of solutions to meromorphic differential 
equations, are studied in Subsection VII.9.1, p. 493. 
Algorithm Z: Decide whether two holonomic functions $A(z)$, $B(z)$ are equal 

Let $\Sigma$, $T$ be holonomic descriptions of $A$, $B$ (by equations or systems); 

Compute a holonomic differential equation $\Upsilon$ for $h := A - B$; 

(Or simply determine an upper bound $e$ on the order of $\Upsilon$.) 

Output ‘equal’ iff $h(0) = h'(0) = \cdots = h^{(e-1)}(0) = 0$, with $e$ the order of $\Upsilon$. 
The book titled “A = B” by Petkovšek, Wilf, and Zeilberger [393] abundantly 
illustrates the application of this method to combinatorial and special function identities. 
Interest in the approach is reinforced by the existence of powerful symbolic manipulation systems: 
Salvy and Zimmermann [427] have implemented univariate algebraic closure 
operations; Chyzak and Salvy [90, 92] have developed algorithms for the multivariate 
holonomicity discussed below. 


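EXAMPLE II.1. The Euler-Landen identities for dilogarithms. Let as usual 
$\mathrm{Li}_\alpha(z) := \sum_{n\ge 1} z^n/n^\alpha$ represent the polylogarithm function. Around 1760, Landen and Euler 
discovered the dilogarithmic identity [42, p. 247], 

(20)    $$\mathrm{Li}_2\!\left(\frac{-z}{1-z}\right) = -\frac{1}{2}\log^2(1-z) - \mathrm{Li}_2(z),$$

which corresponds to the (easy) identity on coefficients (extract $[z^n]$) 

(21)    $$\sum_{k=1}^{n} \binom{n-1}{k-1}\frac{(-1)^k}{k^2} = -\frac{1}{n}\sum_{k=1}^{n}\frac{1}{k},$$

and specializes (at $z = \frac12$) to the infinite series evaluation 

$$\mathrm{Li}_2\!\left(\tfrac{1}{2}\right) \equiv \sum_{n\ge 1}\frac{1}{n^2 2^n} = \frac{\pi^2}{12} - \frac{1}{2}\log^2 2.$$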


Write A and B for the left and right sides of (20), respectively. The differential equations for 
A, B are built in stages, according to closure properties: 


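(22) 
$\mathrm{Li}_1(z)$ :    $(1-z)\,\partial^2 y - \partial y = 0$ 
$\mathrm{Li}_1(z)^2$ :  $(1-z)^2\,\partial^3 y - 3(1-z)\,\partial^2 y + \partial y = 0$ 
$\mathrm{Li}_2(z)$ :    $z(1-z)\,\partial^3 y + (2-3z)\,\partial^2 y - \partial y = 0$ 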
B(z): — 23(362° +--+. — 880)(1 — z)8d°y + --- — 48(2252° + --- +1240)dy = 0 
$A(z)$ :    $z(1-z)^2\,\partial^3 y + (1-z)(2-5z)\,\partial^2 y - (3-4z)\,\partial y = 0$ 


Thus, $A - B$ lives a priori in a vector space of dimension $12 = 3 + 9$. It thus suffices to check 
the coincidence of the expansions of both members of (20) up to order 12 in order to prove the 
identity $A = B$. (An upper bound on the dimension of the vector space is actually enough.) 
Equivalently, given the automatic computations of (22), it suffices to verify the particular cases 
of the identity (21) in order to have a complete proof of it.        END OF EXAMPLE II.1. 
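
The finite verification that closes Example II.1 can be sketched in a few lines. The block below is only an illustration (the function names are ours, not part of the original text): it checks, in exact rational arithmetic, that the coefficient of $z^n$ on the left of (20), namely $\sum_{k=1}^{n}\binom{n-1}{k-1}(-1)^k/k^2$, equals the coefficient $-H_n/n$ on the right for $n = 1, \ldots, 12$, and it checks numerically the specialization at $z = 1/2$.

```python
from fractions import Fraction
from math import comb, log, pi


def landen_lhs(n):
    """[z^n] Li_2(-z/(1-z)) = sum_{k=1..n} binom(n-1, k-1) (-1)^k / k^2."""
    return sum(Fraction((-1) ** k * comb(n - 1, k - 1), k * k)
               for k in range(1, n + 1))


def landen_rhs(n):
    """[z^n] of -1/2 log^2(1-z) - Li_2(z), which simplifies to -H_n / n."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return -harmonic / n


# Order 12 is the a priori dimension bound quoted in the example.
ORDER = 12
coefficients_agree = all(landen_lhs(n) == landen_rhs(n)
                         for n in range(1, ORDER + 1))

# Numerical check of the specialization at z = 1/2 (series truncated at n = 80).
li2_half = sum(1.0 / (n * n * 2 ** n) for n in range(1, 80))
closed_form = pi ** 2 / 12 - 0.5 * log(2) ** 2
```

The exact comparison plays the role of the test $h(0) = h'(0) = \cdots = 0$ in Algorithm Z; it does not compute the differential equations (22) themselves, which require a computer algebra system.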


▷ II.13. Holonomic functions as solutions of systems. (This is a simple outcome of Note 41, 
p. 496.) A holonomic function $y(z)$ which satisfies a linear differential equation of order $m$ with 
coefficients in $\mathbb{C}(z)$ is also the first component of a first-order differential system of dimension $m$ 
with rational coefficients: $y(z) = Y_1(z)$, where 

(23)    $$\frac{d}{dz}Y_1(z) = a_{1,1}(z)Y_1(z) + \cdots + a_{1,m}(z)Y_m(z), \quad \ldots, \quad \frac{d}{dz}Y_m(z) = a_{m,1}(z)Y_1(z) + \cdots + a_{m,m}(z)Y_m(z),$$

where each $a_{i,j}(z)$ is a rational function. Conversely, any solution of a system (23) with the 
$a_{i,j} \in \mathbb{C}(z)$ is holonomic. ◁ 


▷ II.14. The Laplace transform. Let $f(z) = \sum_{n\ge 0} f_n z^n$ be a formal power series. Its Laplace 
transform $g = \mathcal{L}[f]$ is defined as the formal power series: 

$$\mathcal{L}[f](x) = \sum_{n=0}^{\infty} n!\, f_n x^n.$$

(Thus Laplace transforms convert EGFs into OGFs.) Under suitable convergence conditions, 
the Laplace transform is analytically representable by 

$$\mathcal{L}[f](x) = \int_0^{\infty} f(xz)\, e^{-z}\, dz.$$

The following property holds: A series is holonomic if and only if its Laplace transform is 
holonomic. [Hint: use P-recurrences (19).] ◁ 


▷ II.15. Hypergeometric functions. Let $(a)_n$ represent the rising factorial 
$a(a+1)\cdots(a+n-1)$. The function of one variable, $z$, and three parameters, $a$, $b$, $c$, defined by 

(24)    $$F[a, b; c; z] := \sum_{n\ge 0}\frac{(a)_n (b)_n}{(c)_n}\frac{z^n}{n!},$$

is known as a hypergeometric function. It satisfies the differential equation 

(25)    $$z(1-z)\frac{d^2 y}{dz^2} + \bigl(c - (a+b+1)z\bigr)\frac{dy}{dz} - ab\,y = 0,$$

and is consequently a holonomic function. An accessible introduction appears in [492, Ch. XIV]. 
The generalized hypergeometric function (or series) depends on $p + q$ parameters 
$a_1, \ldots, a_p$ and $c_1, \ldots, c_q$, and is defined by 

(26)    $${}_pF_q[a_1, \ldots, a_p; c_1, \ldots, c_q; z] := \sum_{n\ge 0}\frac{(a_1)_n\cdots(a_p)_n}{(c_1)_n\cdots(c_q)_n}\frac{z^n}{n!},$$

so that $F$ in (24) is a ${}_2F_1$. Hypergeometric functions satisfy a rich set of identities [153, 438], 
many of which can be verified (though not discovered) by Algorithm Z. ◁ 
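
As a small illustration of why (24) is holonomic, the sketch below checks in exact arithmetic that the coefficients $u_n = (a)_n(b)_n/((c)_n\, n!)$, built with rising factorials, annihilate the differential equation (25) coefficient by coefficient. Extracting $[z^n]$ from (25) gives the P-recurrence $(n+1)(n+c)\,u_{n+1} = (n+a)(n+b)\,u_n$, which is what is tested. The parameter values and the helper names are arbitrary choices made for the illustration.

```python
from fractions import Fraction


def rising(a, n):
    """Rising factorial a (a+1) ... (a+n-1)."""
    result = Fraction(1)
    for i in range(n):
        result *= a + i
    return result


def hyp2f1_coeffs(a, b, c, order):
    """Coefficients u_n = (a)_n (b)_n / ((c)_n n!) for 0 <= n < order."""
    return [rising(a, n) * rising(b, n) / (rising(c, n) * rising(Fraction(1), n))
            for n in range(order)]


def ode_residuals(a, b, c, order):
    """[z^n] of z(1-z) y'' + (c - (a+b+1) z) y' - a b y, for 0 <= n < order."""
    u = hyp2f1_coeffs(a, b, c, order + 1)
    return [(n + 1) * (n + c) * u[n + 1] - (n + a) * (n + b) * u[n]
            for n in range(order)]


# Arbitrary rational test parameters (an assumption of this sketch).
a, b, c = Fraction(1, 2), Fraction(1, 3), Fraction(5, 4)
residuals = ode_residuals(a, b, c, 20)

# Sanity check on a classical case: F[1, 1; 2; z] = -log(1 - z) / z.
log_case = hyp2f1_coeffs(Fraction(1), Fraction(1), Fraction(2), 4)
```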

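Multivariate holonomic functions. Let $\mathbf{z} = (z_1, \ldots, z_m)$ be a collection of 
variables and $\mathbb{C}(\mathbf{z})$ the field of all rational fractions in the variables $\mathbf{z}$. For 
$\mathbf{n} = (n_1, \ldots, n_m)$ a vector of integers, we define $\mathbf{z}^{\mathbf{n}}$ to be $z_1^{n_1}\cdots z_m^{n_m}$ and let 
$\partial^{\mathbf{n}}$ represent $\partial_{z_1}^{n_1}\cdots\partial_{z_m}^{n_m}$. 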


Definition B.2. A multivariate formal power series (or function) $f(\mathbf{z})$ is said to be 
holonomic if the vector space over $\mathbb{C}(\mathbf{z})$ spanned by the set of all derivatives $\{\partial^{\mathbf{n}} f(\mathbf{z})\}$ 
is finite dimensional. 


Since the partial derivatives $\partial_{z_1}^{j} f$ are bound (linearly dependent), a multivariate holonomic function 
satisfies a differential equation of the form 

$$c_{1,0}(\mathbf{z})\frac{\partial^{r_1}}{\partial z_1^{r_1}} f(\mathbf{z}) + \cdots + c_{1,r_1}(\mathbf{z})\, f(\mathbf{z}) = 0,$$

and similarly for $z_2, \ldots, z_m$. (Any system of equations with possibly mixed partial 
derivatives that allows one to determine all partial derivatives in terms of a finite number 
of them serves to define a multivariate holonomic function.) Denominators can be 
cleared, upon multiplication by the l.c.m. of all the denominators that figure in the system 
of defining equations. It follows that coefficients of multivariate holonomic 
functions satisfy particular systems of recurrence equations with polynomial coefficients, 
which are characterized in [335]. 

Given $f(\mathbf{z})$ viewed as a function of $z_1, z_2$ (with the remaining variables treated 
as parameters) and abbreviated as $f(z_1, z_2)$, we define the diagonal with respect to 
variables $z_1, z_2$ as 

$$\mathrm{Diag}_{z_1,z_2}\bigl[f(z_1, z_2)\bigr] := \sum_{\nu} f_{\nu,\nu}\, z_1^{\nu}, \qquad\text{where}\qquad f(z_1, z_2) = \sum_{n_1,n_2} f_{n_1,n_2}\, z_1^{n_1} z_2^{n_2}.$$

The Hadamard product is defined like in the univariate case, with respect to a specific 
variable (e.g., $z_1$). 


Theorem B.3 (Multivariate holonomic closure). The class of multivariate holonomic 
functions is closed under the following operations: sum ($+$), product ($\times$), Hadamard 
product ($\odot$), differentiation ($\partial$), indefinite integration ($\int$), algebraic substitution, 
specialization (setting some variable to a constant), and diagonal. 
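
A standard instance of closure under diagonals, offered here as an illustration rather than as part of the original text: the diagonal of the rational function $1/(1 - z_1 - z_2)$ has coefficients $\binom{2\nu}{\nu}$, so it equals $(1-4z)^{-1/2}$, which is algebraic and hence holonomic. The sketch below compares the diagonal coefficients with those produced by the P-recurrence $(n+1)c_{n+1} = (4n+2)c_n$ satisfied by $(1-4z)^{-1/2}$.

```python
from math import comb


def bivariate_coeff(n1, n2):
    """[z1^n1 z2^n2] 1/(1 - z1 - z2) = binom(n1 + n2, n1)."""
    return comb(n1 + n2, n1)


def diagonal(coeff, order):
    """First `order` coefficients of the diagonal: nu -> coeff(nu, nu)."""
    return [coeff(nu, nu) for nu in range(order)]


def inv_sqrt_1_minus_4z(order):
    """Coefficients of (1 - 4z)^(-1/2) via (n+1) c_{n+1} = (4n+2) c_n, c_0 = 1."""
    c = [1]
    for n in range(order - 1):
        c.append(c[n] * (4 * n + 2) // (n + 1))  # division is exact
    return c


diag = diagonal(bivariate_coeff, 10)
```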


An elementary proof of this remarkable theorem (in the sense that it does not 
appeal to higher concepts of differential algebra) is given by Lipshitz in [334, 335]. 
The closure theorem and its companion algorithms [90, 463] make it possible to prove, 
or verify, identities automatically, many of which are nontrivial. For instance, in his 
proof of the irrationality of the number $\zeta(3) = \sum_{n\ge 1} 1/n^3$, Apéry introduced the 
combinatorial sequence, 

(27)    $$A_n = \sum_{k=0}^{n}\binom{n}{k}^{2}\binom{n+k}{k}^{2},$$

for which a proof was needed [478] of the fact that it satisfies the recurrence 

(28)    $$(n+1)^3 B_n + (n+2)^3 B_{n+2} - (2n+3)(17n^2 + 51n + 39)\,B_{n+1} = 0,$$


with $B_1 = 5$, $B_2 = 73$. Obviously, the generating function $B(z)$ of the sequence 
$(B_n)$ as defined by the P-recurrence (28) is univariate holonomic. Repeated use of 
the multivariate closure theorem shows that the ordinary generating function $A(z)$ of 
the sequence $A_n$ of (27) is holonomic. (Indeed, start from the explicit 

$$\sum_{n_1,n_2}\binom{n_1}{n_2} z_1^{n_1} z_2^{n_2} = \frac{1}{1 - z_1(1+z_2)}, \qquad \sum_{n_1,n_2}\binom{n_1+n_2}{n_2} z_1^{n_1} z_2^{n_2} = \frac{1}{1 - z_1 - z_2},$$

and apply suitable Hadamard products and diagonal operations.) This gives an ordinary 
differential equation satisfied by $A(z)$. The proof is then completed by checking 
that $A_n$ and $B_n$ coincide for enough initial values of $n$. 
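
The last step, the comparison of initial values, is easy to reproduce. The following sketch (function names are ours) computes the Apéry numbers from the binomial sum (27) and checks that they satisfy the recurrence (28) for the first few dozen values of $n$; it illustrates the verification, not the holonomic machinery that produces the recurrence.

```python
from math import comb


def apery(n):
    """A_n = sum_{k=0..n} binom(n, k)^2 binom(n+k, k)^2, as in (27)."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))


def recurrence_residual(seq, n):
    """Left-hand side of (28) evaluated on the sequence `seq` at index n."""
    return ((n + 1) ** 3 * seq[n] + (n + 2) ** 3 * seq[n + 2]
            - (2 * n + 3) * (17 * n * n + 51 * n + 39) * seq[n + 1])


A = [apery(n) for n in range(30)]
residuals = [recurrence_residual(A, n) for n in range(28)]
```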

Holonomic functions in infinitely many variables. Let $f$ be a power series in 
infinitely many variables $x_1, x_2, \ldots$. Let $S \subset \mathbb{Z}_{\ge 1}$ be a subset of indices. We write 
$f_S$ for the specialization of $f$ in which all the variables whose indices do not belong 
to $S$ are set to 0. Following Gessel [234], we say that the series $f$ in infinitely many 
variables is holonomic if, for each finite $S$, the specialization $f_S$ is holonomic (in the 
variables $x_s$ for $s \in S$). Gessel has developed a powerful calculus in the case of 
series $f$ that are symmetric functions, with stunning consequences for combinatorial 
enumeration. 
An undirected graph is called $k$-regular if every vertex has degree exactly $k$. A 
standard Young tableau is the Ferrers diagram of an integer partition, filled with 
consecutive integers in a way that is increasing along rows and columns. The classical 
Robinson-Schensted-Knuth correspondence establishes a bijection between permutations 
of size $n$ and pairs of Young tableaux of size $n$ having the same shape. The 
common height of the tableaux in the pair associated to a permutation $\sigma$ coincides 
with the length of the longest increasing subsequence of $\sigma$. A $k \times n$ Latin rectangle 
is a $k \times n$ matrix with elements in the set $\{1, 2, \ldots, n\}$ such that entries in each row 
and column are distinct. (It is thus a $k$-tuple of “discordant” permutations.) 

Gessel’s calculus [233, 234] provides a unified approach to establishing the holonomic 
character of many generating functions associated with combinatorial structures 
like: Young tableaux, permutations of uniform multisets, increasing subsequences 
in permutations, Latin rectangles, regular graphs, matrices with fixed row and 
column sums, and so on. For instance: the generating functions of Latin rectangles and 
Young tableaux of height at most $k$, of $k$-regular graphs, and of permutations whose 
longest increasing subsequence is of length $k$ are holonomic functions. In particular, 
the number $Y_{n,k}$ of permutations of size $n$ with longest increasing subsequence $\le k$ 
satisfies 

(29)    $$\sum_{n\ge 0} Y_{n,k}\frac{z^{2n}}{n!^2} = \det\bigl[I_{|i-j|}(2z)\bigr]_{1\le i,j\le k}, \qquad\text{where}\qquad I_\nu(2z) = \sum_{n=0}^{\infty}\frac{z^{2n+\nu}}{n!\,(n+\nu)!},$$

that is, a corresponding GF is expressible as a determinant of Bessel functions. Other 
applications are described in [91, 363]. 
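
The determinantal formula (29) can be tested by brute force in its smallest nontrivial case. The sketch below is an illustration under our own naming: it takes $k = 2$, expands $\det[I_{|i-j|}(2z)] = I_0(2z)^2 - I_1(2z)^2$ as a truncated power series with exact rational coefficients, and compares $n!^2\,[z^{2n}]$ of the result with a direct enumeration of permutations of size $n \le 6$ whose longest increasing subsequence has length at most 2. (These counts are the Catalan numbers.)

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def lis_length(perm):
    """Length of the longest increasing subsequence (quadratic dynamic programming)."""
    best = []
    for i, x in enumerate(perm):
        best.append(1 + max((best[j] for j in range(i) if perm[j] < x), default=0))
    return max(best, default=0)


def count_lis_at_most(n, k):
    """Number of permutations of size n whose longest increasing subsequence is <= k."""
    return sum(1 for p in permutations(range(n)) if lis_length(p) <= k)


def bessel_series(nu, order):
    """Coefficients of I_nu(2z) = sum_n z^(2n+nu) / (n! (n+nu)!), up to z^order."""
    c = [Fraction(0)] * (order + 1)
    n = 0
    while 2 * n + nu <= order:
        c[2 * n + nu] = Fraction(1, factorial(n) * factorial(n + nu))
        n += 1
    return c


def mul(a, b):
    """Product of two power series truncated at the common order."""
    order = len(a) - 1
    c = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j in range(order + 1 - i):
            c[i + j] += ai * b[j]
    return c


N = 6
i0, i1 = bessel_series(0, 2 * N), bessel_series(1, 2 * N)
det2 = [x - y for x, y in zip(mul(i0, i0), mul(i1, i1))]  # det [[I0, I1], [I1, I0]]
from_determinant = [det2[2 * n] * factorial(n) ** 2 for n in range(N + 1)]
by_enumeration = [count_lis_at_most(n, 2) for n in range(N + 1)]
```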


5. Implicit Function Theorem. In its real variable version, the implicit function 
theorem asserts that, for a sufficiently smooth function $F(z, w)$ of two variables, a 
solution to the equation $F(z, w) = 0$ exists in the vicinity of a solution point 
$(z_0, w_0)$ (therefore satisfying $F(z_0, w_0) = 0$) provided the partial derivative satisfies 
$F'_w(z_0, w_0) \ne 0$. This theorem admits a complex-analytic extension, which is essential 
for the analysis of recursive structures. 

Without loss of generality, one restricts attention to $(z_0, w_0) = (0, 0)$. We consider 
here a function $F(z, w)$ that is analytic in two complex variables in the sense 
that it admits a convergent representation valid in a polydisc, 

(30)    $$F(z, w) = \sum_{m,n\ge 0} f_{m,n}\, z^m w^n, \qquad |z| < R, \quad |w| < S,$$

for some $R, S > 0$ (cf. APPENDIX B: Several complex variables, p. 712). 

Theorem B.4 (Analytic Implicit Functions). Let $F$ be bivariate analytic near $(0, 0)$. 
Assume that $F(0, 0) = f_{0,0} = 0$ and $F'_w(0, 0) = f_{0,1} \ne 0$. Then, there exists a unique 
function $f(z)$ analytic in a neighbourhood $|z| < \rho$ of 0 such that $f(0) = 0$ and 

$$F(z, f(z)) = 0, \qquad |z| < \rho.$$
▷ II.16. Proofs of the Implicit Function Theorem. See Hille’s book [269] for details. 

(i) Proof by residues. Make use of the principle of the argument and Rouché’s Theorem 
to see that the equation $F(z, w) = 0$ has a unique solution near 0 for $|z|$ small enough. Appeal then 
to the related result of Chapter IV (based on the residue theorem) that expresses the sum of the 
solutions to an equation as a contour integral. Here, this expresses the solution as ($C$ a small 
enough contour around 0 in the $w$-plane) 

$$f(z) = \frac{1}{2i\pi}\int_{C} w\,\frac{F'_w(z, w)}{F(z, w)}\, dw,$$

which is checked to represent an analytic function of $z$. 
(ii) Proof by majorant series. Set $G(z, w) := w - f_{0,1}^{-1} F(z, w)$. The equation $F(z, w) = 0$ 
becomes the fixed-point equation $w = G(z, w)$. The bivariate series $G$ has its coefficients 
dominated termwise by those of 

$$\widehat{G}(z, w) = \frac{A}{(1 - z/R)(1 - w/S)} - A - A\,\frac{w}{S},$$

for a suitable constant $A > 0$. The equation $w = \widehat{G}(z, w)$ is quadratic. It admits a solution $\widehat{f}(z)$ analytic at 0, 

$$\widehat{f}(z) = A\,\frac{z}{R} + \frac{A(A^2 + AS + S^2)}{S^2}\,\frac{z^2}{R^2} + \cdots,$$

whose coefficients dominate termwise those of $f$. 


(iii) Proof by Picard’s method of successive approximants. With $G$ like before, define the 
sequence of functions 

$$\phi_0(z) := 0; \qquad \phi_{j+1}(z) = G(z, \phi_j(z)),$$

each analytic in a small neighbourhood of 0. Then $f(z)$ can be obtained as 

$$f(z) = \lim_{j\to\infty}\phi_j(z) = \phi_0(z) - \sum_{j=0}^{\infty}\bigl(\phi_j(z) - \phi_{j+1}(z)\bigr),$$

which is itself checked to be analytic near 0 by the geometric convergence of the series. ◁ 
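
Picard’s scheme is also a practical way of computing the Taylor coefficients of an implicit function. The sketch below is an illustration on an equation of our choosing, $F(z, w) = w - z e^{w}$, for which $F(0,0) = 0$, $F'_w(0,0) = 1$ and $G(z, w) = z e^{w}$; the solution is the Cayley tree function with coefficients $n^{n-1}/n!$. Each iteration $\phi_{j+1} = G(z, \phi_j)$ fixes one more coefficient, so `ORDER` iterations suffice for a series truncated at `ORDER` terms.

```python
from fractions import Fraction
from math import factorial


def series_exp(a):
    """exp of a truncated power series a with a[0] == 0.

    Uses b' = a' b, i.e. n b_n = sum_{k=1..n} k a_k b_{n-k}.
    """
    order = len(a)
    b = [Fraction(1)] + [Fraction(0)] * (order - 1)
    for n in range(1, order):
        b[n] = sum(k * a[k] * b[n - k] for k in range(1, n + 1)) / n
    return b


def picard_step(phi):
    """phi -> G(z, phi) with G(z, w) = z exp(w), truncated to the same order."""
    e = series_exp(phi)
    return [Fraction(0)] + e[:-1]  # multiplication by z shifts coefficients


ORDER = 10
phi = [Fraction(0)] * ORDER  # phi_0 = 0
for _ in range(ORDER):
    phi = picard_step(phi)
```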


Weierstrass Preparation. The Weierstrass Preparation Theorem (WPT), also 
known as Vorbereitungssatz, is a useful complement to the Implicit Function 
Theorem. 

Given a collection $\mathbf{z} = (z_1, \ldots, z_m)$ of variables, we designate as usual by $\mathbb{C}[[\mathbf{z}]]$ 
the ring of formal power series in indeterminates $\mathbf{z}$. We let $\mathbb{C}\{\mathbf{z}\}$ denote the subset 
of these that are convergent in a neighbourhood of $(0, \ldots, 0)$, i.e., analytic (cf. 
APPENDIX B: Several complex variables, p. 712). 


Theorem B.5 (Weierstrass Preparation). Let $F = F(z_1, \ldots, z_m)$ in $\mathbb{C}[[\mathbf{z}]]$ (respectively 
$\mathbb{C}\{\mathbf{z}\}$) be such that $F(0, \ldots, 0) = 0$ and $F$ depends on at least one of the $z_j$ 
with $j \ge 2$ (i.e., $F(0, z_2, \ldots, z_m)$ is not identically 0). Define a Weierstrass polynomial 
to be a polynomial of the form 

$$W(z) = z^d + g_1 z^{d-1} + \cdots + g_d,$$

where $g_j \in \mathbb{C}[[z_2, \ldots, z_m]]$ (respectively $g_j \in \mathbb{C}\{z_2, \ldots, z_m\}$), with $g_j(0, \ldots, 0) = 0$. 
Then, $F$ admits a unique factorization 

$$F(z_1, z_2, \ldots, z_m) = W(z_1) \cdot X(z_1, \ldots, z_m),$$

where $W(z)$ is a Weierstrass polynomial and $X$ is an element of $\mathbb{C}[[z_1, \ldots, z_m]]$ 
(respectively $\mathbb{C}\{z_1, \ldots, z_m\}$) satisfying $X(0, 0, \ldots, 0) \ne 0$. 


PROOF. [Sketch] An accessible proof and a discussion of the formal algebraic result 
are found in Abhyankar’s lecture notes [1, Ch. 16]. 
The analytic version of the theorem is the one of use to us in this book. We prove 
it in the representative case where $m = 2$ and write $F(z, w)$ for $F(z_1, z_2)$. First, the 
number of roots of the equation $F(z, w) = 0$ is given by the integral formula 

(31)    $$\frac{1}{2i\pi}\int_{\gamma}\frac{F'_w(z, w)}{F(z, w)}\, dw,$$

where $\gamma$ is a small contour encircling 0 in the $w$-plane. There exists a sufficiently small 
open set $\Omega$ containing 0 such that the quantity (31), which is an analytic function of $z$ 
while being an integer, is constant, and thus equal to its value at $z = 0$, which we 
call $d$. The quantity $d$ is the multiplicity of 0 as a root of the equation $F(0, w) = 0$. 
In other words, we have shown that if $F(0, w) = 0$ has $d$ roots equal to 0, then there 
are $d$ values of $w$ near 0 (within $\gamma$) such that $F(z, w) = 0$, provided $z$ remains small 
enough (within $\Omega$). 

Let $y_1, \ldots, y_d$ be these $d$ roots. Then, we have for the power sum symmetric 
functions, 

$$y_1^r + \cdots + y_d^r = \frac{1}{2i\pi}\int_{\gamma}\frac{F'_w(z, w)}{F(z, w)}\, w^r\, dw,$$

which are analytic functions of $z$ when $z$ is sufficiently near to 0. There results from 
relations between symmetric functions (Note III.24, p. 177) that $y_1, \ldots, y_d$ are the 
solutions of a polynomial with analytic coefficients, $W$, which is a uniquely defined 
Weierstrass polynomial. The factorization finally results from the fact that $F/W$ has 
removable singularities. 

In essence, Theorem B.5 implies that functions implicitly defined by a transcendental 
equation (an equation $F = 0$) are locally of the same nature as algebraic functions 
(corresponding to the equation $W = 0$). In particular, for $m = 2$, when the 
solutions have singularities, these singularities can only be branch points and companion 
Puiseux expansions hold (Chapter VII). The theorem acquires even greater 
importance when perturbative singular expansions (corresponding to $m \ge 3$) become 
required for the purpose of extracting limit laws (Chapter IX). 


6. Laplace’s method. The method of Laplace serves to estimate asymptotically real 
integrals depending on a large parameter n (which may be a positive integer or real 
number). Though it is primarily a real analysis technique, we present it in detail in 
this appendix given its relevance to the saddle point method, which deals instead with 
complex contour integrals. 


Case study: a Wallis integral. In order to demonstrate the essence of the method, 
consider first the problem of estimating asymptotically the Wallis integral 

(32)    $$I_n := \int_{-\pi/2}^{\pi/2} (\cos x)^n\, dx,$$

as $n \to +\infty$. The cosine attains its maximum at $x = 0$ (where its value is 1), and 
since the integrand of $I_n$ is a large power, the contribution to the integral outside any 
fixed segment containing 0 is exponentially small and can consequently be discarded 
for all asymptotic purposes. A glance at the plot of $\cos^n x$ as $n$ varies (Figure 2) also 
suggests that the integrand tends to conform to a bell-shaped profile near the centre as 
$n$ increases. This is not hard to verify: set $x = w/\sqrt{n}$, then a local expansion yields 

(33)    $$\cos^n x \equiv \exp\bigl(n \log\cos(x)\bigr) = \exp\left(-\frac{w^2}{2} + O(n^{-1}w^4)\right),$$

the approximation being valid as long as $w = O(n^{1/4})$. Accordingly, we choose 
(somewhat arbitrarily) 

$$\kappa_n := n^{1/10},$$

and define the central range by $|w| \le \kappa_n$. These considerations suggest rewriting the 
integral $I_n$ as 

$$I_n = \frac{1}{\sqrt{n}}\int_{-\pi\sqrt{n}/2}^{+\pi\sqrt{n}/2}\left(\cos\frac{w}{\sqrt{n}}\right)^{n} dw,$$

and expecting under this new form an approximation by a Gaussian integral arising from 
the central range. 


Laplace’s method proceeds in three steps: 
(i) Neglect the tails of the original integral; 
(ii) Centrally approximate the integrand by a Gaussian; 
(iii) Complete the tails of the Gaussian integral. 


In the case of the cosine integral (32), the chain is summarized in Figure 3. Details of 
the analysis follow. 


(i) Neglect the tails of the original integral: By (33), we have 

$$\cos^n\left(\frac{\kappa_n}{\sqrt{n}}\right) \sim \exp\left(-\frac{1}{2}n^{1/5}\right),$$

and, as the integrand is unimodal, this exponentially small quantity bounds the integrand 
throughout $|w| > \kappa_n$, that is, on a large part of the integration interval. This 
gives 

(34)    $$I_n = \int_{-\kappa_n/\sqrt{n}}^{+\kappa_n/\sqrt{n}} \cos^n x\, dx + O\left(\exp\left(-\frac{1}{2}n^{1/5}\right)\right),$$

and the error term is of the order of $\exp(-\frac12 n^{1/5})$. 

FIGURE II.2. Plots of $\cos^n x$ [left] and $\cos^n(w/\sqrt{n})$ [right], for $n = 1\,..\,20$. 

FIGURE II.3. A typical application of the Laplace method: [Neglect the tails], [Central approximation], [Complete the tails]. 


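(ii) Centrally approximate the integrand by a Gaussian: In the central region, we 
have 

(35)    $$\begin{aligned}
\int_{-\kappa_n/\sqrt{n}}^{+\kappa_n/\sqrt{n}} \cos^n x\, dx
 &= \frac{1}{\sqrt{n}}\int_{-\kappa_n}^{\kappa_n} e^{-w^2/2}\exp\bigl(O(n^{-1}w^4)\bigr)\, dw \\
 &= \frac{1}{\sqrt{n}}\int_{-\kappa_n}^{\kappa_n} e^{-w^2/2}\bigl(1 + O(n^{-1}w^4)\bigr)\, dw \\
 &= \frac{1}{\sqrt{n}}\int_{-\kappa_n}^{\kappa_n} e^{-w^2/2}\, dw + O(n^{-3/2}),
\end{aligned}$$

given the uniformity of approximation (33) for $w$ in the integration interval. 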


(iii) Complete the tails of the Gaussian integral: The incomplete Gaussian integral 
in (35) can be easily estimated once it is observed that its tails are small. Precisely, 
one has, for $W \ge 0$, 

$$\int_{W}^{\infty} e^{-w^2/2}\, dw \le e^{-W^2/2}\int_{0}^{\infty} e^{-h^2/2}\, dh = \sqrt{\frac{\pi}{2}}\, e^{-W^2/2}$$

(by the change of variable $w = W + h$). Thus, 

(36)    $$\int_{-\kappa_n}^{+\kappa_n} e^{-w^2/2}\, dw = \int_{-\infty}^{+\infty} e^{-w^2/2}\, dw + O\left(\exp\left(-\frac{1}{2}n^{1/5}\right)\right).$$

It now suffices to collect the three approximations, (34), (35), and (36): we have 
obtained in this way 

(37)    $$I_n = \frac{1}{\sqrt{n}}\int_{-\infty}^{+\infty} e^{-w^2/2}\, dw + O(n^{-3/2}) \equiv \sqrt{\frac{2\pi}{n}} + O(n^{-3/2}).$$

These three steps are the heart of Laplace’s method. 


In the asymptotic scale of the problem, the exponentially small errors in the tails 
can be completely neglected. The error in (37) then arises from the central approximation 
(33), and its companion $O(w^4 n^{-1})$ term. This can easily be improved and it 
suffices to appeal to further terms in the expansion of $\log\cos x$ near 0. For instance, 
one has ($x = w/\sqrt{n}$): 

(38)    $$\cos^n x = e^{-w^2/2}\left(1 - \frac{w^4}{12n} + O(n^{-2}w^8)\right).$$
12n 


Proceeding like before, we find that a further term in the expansion of $I_n$ is obtained 
by considering the additive correction 

$$\epsilon_n := -\frac{1}{\sqrt{n}}\int_{-\infty}^{+\infty} e^{-w^2/2}\left(\frac{w^4}{12n}\right) dw = -\sqrt{\frac{\pi}{8n^3}},$$

so that 

$$I_n = \sqrt{\frac{2\pi}{n}} - \sqrt{\frac{\pi}{8n^3}} + O(n^{-5/2}).$$

Clearly, a full asymptotic expansion can be obtained in this manner. 

▷ II.17. Wallis integrals and central binomials. The integral $I_n$ is an integral considered by 
John Wallis (1616-1703). It can be evaluated through partial integration or by its relation to the 
Beta integral (Note 11) as $I_n = \Gamma(\frac12)\Gamma(\frac{n}{2} + \frac12)/\Gamma(\frac{n}{2} + 1)$. There results ($n \mapsto 2n$) 

$$\binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{\pi n}}\left(1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3} - \cdots\right),$$

which is yet another avatar of Stirling’s formula. ◁ 
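
The two-term estimate of the Wallis integral and the central binomial expansion of Note 17 can be checked numerically. The sketch below is illustrative only (the names are ours): it evaluates $I_n$ through the Gamma-function formula $I_n = \Gamma(\frac12)\Gamma(\frac n2 + \frac12)/\Gamma(\frac n2 + 1)$, using `math.lgamma` to avoid overflow, and compares it with $\sqrt{2\pi/n} - \sqrt{\pi/(8n^3)}$.

```python
from math import comb, exp, lgamma, pi, sqrt


def wallis_exact(n):
    """I_n = Gamma(1/2) Gamma(n/2 + 1/2) / Gamma(n/2 + 1), computed via logarithms."""
    return exp(lgamma(0.5) + lgamma(n / 2 + 0.5) - lgamma(n / 2 + 1))


def wallis_two_terms(n):
    """Two-term Laplace estimate sqrt(2 pi / n) - sqrt(pi / (8 n^3))."""
    return sqrt(2 * pi / n) - sqrt(pi / (8 * n ** 3))


def central_binomial_ratio(n):
    """binom(2n, n) divided by its main asymptotic term 4^n / sqrt(pi n)."""
    return comb(2 * n, n) / 4 ** n * sqrt(pi * n)


n = 1000
one_term_error = abs(wallis_exact(n) - sqrt(2 * pi / n))
two_term_error = abs(wallis_exact(n) - wallis_two_terms(n))
binomial_gap = abs(central_binomial_ratio(n) - (1 - 1 / (8 * n) + 1 / (128 * n ** 2)))
```

For $n = 1000$ the one-term error is of order $n^{-3/2}$ while the two-term error drops to order $n^{-5/2}$, in line with the error terms stated above.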


General case of large powers. Laplace’s method applies under very general conditions 
to integrals involving large powers of a fixed function. 

Theorem B.6 (Laplace’s method). Let $f$ and $g$ be indefinitely differentiable real valued 
functions defined over some compact interval $I$ of the real line. Assume that $|g(x)|$ 
attains its maximum at a unique point $x_0$ interior to $I$ and that $f(x_0), g(x_0), g''(x_0) \ne 0$. 
Then, the integral 

$$I_n := \int_{I} f(x)\, g(x)^n\, dx$$

admits a full asymptotic expansion: 

(39)    $$I_n \sim \sqrt{\frac{2\pi}{\lambda n}}\, f(x_0)\, g(x_0)^n\left(1 + \sum_{j\ge 1}\frac{\delta_j}{n^j}\right), \qquad \lambda := -\frac{g''(x_0)}{g(x_0)}.$$

▷ II.18. Proof of Laplace’s Theorem. It follows exactly the steps explained above. Let us 
assume first that $f(x) \equiv 1$. Then, one chooses $\kappa_n$ as a function tending slowly to infinity like 
before ($\kappa_n = n^{1/10}$ is suitable). It suffices to expand 

$$I_n^{(1)} := \int_{x_0 - \kappa_n/\sqrt{n}}^{x_0 + \kappa_n/\sqrt{n}} e^{n \log g(x)}\, dx,$$

as the difference $I_n - I_n^{(1)}$ is exponentially small. Set first $x = x_0 + X$ and 

$$L(X) := \log g(x_0 + X) - \log g(x_0) + \lambda\frac{X^2}{2},$$

so that, with $w = X\sqrt{n}$, the central contribution becomes: 

$$I_n^{(1)} = \frac{g(x_0)^n}{\sqrt{n}}\int_{-\kappa_n}^{\kappa_n} e^{-\lambda w^2/2}\, e^{nL(w/\sqrt{n})}\, dw.$$

Then, it is possible to expand $L(X)$ to any order $M$, 

$$L(X) = \sum_{j=3}^{M-1}\ell_j X^j + O(X^M),$$

and $e^{nL(w/\sqrt{n})}$ admits a full expansion in descending powers of $\sqrt{n}$: 

$$e^{nL(w/\sqrt{n})} \sim 1 + \frac{\ell_3 w^3}{\sqrt{n}} + \frac{2\ell_4 w^4 + \ell_3^2 w^6}{2n} + \cdots.$$

There, by construction, the coefficient of $n^{-k/2}$ is a polynomial $E_k(w)$ of degree $3k$. This 
expression can be truncated to any order, resulting in 

$$I_n^{(1)} = \frac{g(x_0)^n}{\sqrt{n}}\int_{-\kappa_n}^{\kappa_n} e^{-\lambda w^2/2}\left[1 + \sum_{k=1}^{M-1}\frac{E_k(w)}{n^{k/2}} + O\left(\frac{1 + w^{3M}}{n^{M/2}}\right)\right] dw.$$

One can then complete the tails at the expense of exponentially small terms since the Gaussian 
tails are exponentially small. 

The full asymptotic expansion is revealed by the following device: for any power series 
$h(w)$, introduce the Gaussian transform, 

$$\mathfrak{G}[h] := \int_{-\infty}^{+\infty} e^{-w^2/2}\, h(w)\, dw,$$

which is understood to operate by linearity on integral powers of $w$, 

$$\mathfrak{G}[w^{2r}] = 1\cdot 3\cdots(2r-1)\sqrt{2\pi}, \qquad \mathfrak{G}[w^{2r+1}] = 0.$$

Then, the complete asymptotic expansion of $I_n$ is obtained by the formal expansion 

(40)    $$I_n \sim \frac{g(x_0)^n}{\sqrt{\lambda n}}\,\mathfrak{G}\left[\exp\left(\lambda^{-3/2} w^3 y\,\widetilde{L}(\lambda^{-1/2} w y)\right)\right], \qquad \widetilde{L}(X) := X^{-3}L(X), \quad y := n^{-1/2}.$$

The addition of the prefactor $f(x)$ (omitted so far) induces a factor $f(x_0)$ in the main 
term of the final result and it affects the coefficients in the smaller order terms in a computable 
manner. Details are left as an exercise to the reader. ◁ 


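▷ II.19. The next term? One has (with $f_j := f^{(j)}(x_0)$, etc.): 

$$\frac{I_n\sqrt{\lambda n}}{\sqrt{2\pi}\, g(x_0)^n} = f_0 + \frac{-9\lambda^3 f_0 + 12\lambda^2 f_2 + 12\lambda f_1 g_3 + 3\lambda f_0 g_4 + 5 g_3^2 f_0}{24\lambda^3 n} + O(n^{-2}),$$

which is best determined using a symbolic manipulation system. ◁ 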
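
A numerical experiment gives some confidence in the main term (39) and in the next-term formula of Note 19. The sketch below is an illustration on an example of our own choosing, $f(x) = e^x$ and $g(x) = 1/(1+x^2)$ on $I = [-1, 1]$, for which $x_0 = 0$, $g(x_0) = 1$, $\lambda = 2$, $f_0 = f_1 = f_2 = 1$, $g_3 = g'''(0) = 0$ and $g_4 = g''''(0) = 24$. The quadrature routine is a plain composite Simpson rule, not a method prescribed by the text.

```python
from math import exp, pi, sqrt


def simpson(func, a, b, panels=20000):
    """Composite Simpson rule on [a, b] with an even number of panels."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def laplace_integral(n):
    """I_n = int_{-1}^{1} e^x (1 + x^2)^(-n) dx (f = exp, g = 1/(1+x^2), x0 = 0)."""
    return simpson(lambda x: exp(x) * (1 + x * x) ** (-n), -1.0, 1.0)


def normalized(n, lam=2.0):
    """I_n sqrt(lam n) / (sqrt(2 pi) g(x0)^n), expected to tend to f(x0) = 1."""
    return laplace_integral(n) * sqrt(lam * n) / sqrt(2 * pi)


def next_term(n, lam=2.0, f0=1.0, f1=1.0, f2=1.0, g3=0.0, g4=24.0):
    """Correction of order 1/n from Note 19, with the derivative values of the example."""
    num = (-9 * lam ** 3 * f0 + 12 * lam ** 2 * f2 + 12 * lam * f1 * g3
           + 3 * lam * f0 * g4 + 5 * g3 ** 2 * f0)
    return num / (24 * lam ** 3 * n)


n = 200
ratio = normalized(n)
```

With these values the predicted correction is $0.625/n$; the remaining discrepancy is of order $n^{-2}$.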


The method is susceptible of a large number of extensions. Roughly it requires 
a point where the integrand is maximized, which induces some sort of exponential 
behaviour, local expansions then allowing for a replacement by standard integrals. 
▷ II.20. Special cases of Laplace’s method. When $f(x_0) = 0$, the integral normalizes to an 
integral of the form $\int w^2 e^{-w^2/2}\, dw$. If $g''(x_0) = 0$ but $g^{(iv)}(x_0) \ne 0$ then a factor $\Gamma(\frac14)$ replaces 
the characteristic $\sqrt{\pi} \equiv \Gamma(\frac12)$. [Hint: $\int_0^\infty \exp(-w^\beta)\, w^\alpha\, dw = \beta^{-1}\Gamma((\alpha+1)\beta^{-1})$.] If the 
maximum is attained at one end of the interval $I = [a, b]$ while $g'(x_0) = 0$, $g''(x_0) \ne 0$, then 
the estimate (39) must be multiplied by a factor of $\frac12$. If the maximum is attained at one end of 
the interval $I$ while $g'(x_0) \ne 0$, then the right normalization is $X = w/n$ (with $x = x_0 + X$) and the integrand is 
reducible to an exponential $e^{-w}$. Here are some dominant asymptotic terms: 

  $x_0 \ne a, b$;  $g''(x_0) \ne 0$, $f(x_0) = 0$ :        $\sqrt{\dfrac{\pi}{2\lambda^5 n^3}}\; g(x_0)^n\,\bigl(\lambda f''(x_0) + f'(x_0)\, g'''(x_0)\bigr)$ 
  $x_0 \ne a, b$;  $g''(x_0) = 0$, $g^{(iv)}(x_0) \ne 0$ :  $\Gamma(\tfrac14)\left(\dfrac{3}{2\lambda^\star n}\right)^{1/4} f(x_0)\, g(x_0)^n$,   $\left(\lambda^\star := -\dfrac{g^{(iv)}(x_0)}{g(x_0)}\right)$ 
  $x_0 = a$;  $f(x_0) \ne 0$, $g'(x_0) \ne 0$ :            $-\dfrac{1}{n\, g'(x_0)}\, f(x_0)\, g(x_0)^{n+1}$. 

A similar analysis is employed in Chapter VIII, when we discuss coalescence cases of the 
saddle-point method. ◁ 


EXAMPLE II.2. Stirling’s formula via Laplace’s method. Start from an integral representation 
involving $n!$, namely, 

$$I_n := \int_0^{\infty} e^{-nx} x^n\, dx = \frac{n!}{n^{n+1}}.$$

This is a direct case of application of the theorem, except for the fact that the integration interval 
is not compact. The integrand attains its maximum at $x = 1$ and the remainder integral $\int_2^\infty$ is 
accordingly exponentially small as proved by the chain 

$$\begin{aligned}
\int_2^{\infty} e^{-nx} x^n\, dx &= (2e^{-2})^n\int_0^{\infty}\left(1 + \frac{x}{2}\right)^{n} e^{-nx}\, dx && [x \mapsto x + 2] \\
 &< (2e^{-2})^n\int_0^{\infty} e^{nx/2} e^{-nx}\, dx = \frac{2}{n}\,(2e^{-2})^n && [\log(1 + x/2) < x/2].
\end{aligned}$$

Then the integral from 0 to 2 is amenable to the standard version of Laplace’s method as stated 
in Theorem B.6 to the effect that 

$$n! = n^n e^{-n}\sqrt{2\pi n}\left(1 + O\left(\frac{1}{n}\right)\right).$$
The asymptotic expansion of $I_n$ derives from (40) and involves the combinatorial GF 

(41)    $$H(z, u) := \exp\left(u\left(\log\frac{1}{1-z} - z - \frac{z^2}{2}\right)\right).$$

The noticeable fact is that $H(z, u)$ is the exponential BGF of permutations that are generalized 
derangements involving no cycles of length 1 or 2, with $z$ marking size and $u$ marking the 
number of cycles: 

$$H(z, u) = \sum_{n,k\ge 0} h_{n,k}\, u^k\frac{z^n}{n!} = 1 + \frac{1}{3}uz^3 + \frac{1}{4}uz^4 + \frac{1}{5}uz^5 + \left(\frac{1}{6}u + \frac{1}{18}u^2\right)z^6 + \left(\frac{1}{7}u + \frac{1}{12}u^2\right)z^7 + \cdots.$$

Then, a full asymptotic expansion of $I_n$ is obtained by applying the Gaussian transform $\mathfrak{G}$ to 
$H(wy, -y^{-2})$ (with $y = n^{-1/2}$), resulting in 

$$I_n \sim \frac{e^{-n}\sqrt{2\pi}}{\sqrt{n}}\left(1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3} - \cdots\right).$$

Proposition B.1 (Stirling’s formula). The factorial function admits the complete asymptotic 
expansion as $x \to +\infty$: 

$$x! \equiv \Gamma(x+1) \sim x^x e^{-x}\sqrt{2\pi x}\left(1 + \sum_{q\ge 1}\frac{c_q}{x^q}\right).$$

The coefficients satisfy $c_q = \sum_{k=1}^{2q}\dfrac{(-1)^k}{2^{q+k}(q+k)!}\, h_{2q+2k,k}$, where $h_{n,k}$ counts the number of 
permutations of size $n$ having $k$ cycles, all of length $\ge 3$. 

The derivation above is due to Wrench (see [98, p. 267]).        END OF EXAMPLE II.2. 
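
The combinatorial formula for the Stirling coefficients lends itself to a direct computation. The sketch below is an illustration with our own helper names: it counts permutations with $k$ cycles all of length at least 3 by removing the cycle that contains the largest element, then evaluates $c_q = \sum_{k=1}^{2q}(-1)^k h_{2q+2k,k}/(2^{q+k}(q+k)!)$ in exact arithmetic for $q = 1, 2, 3$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def h(n, k):
    """Number of permutations of size n with k cycles, all of length >= 3."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    # The cycle containing element n has length j: choose its j-1 other
    # elements, arrange them cyclically in (j-1)! ways, and recurse.
    return sum(comb(n - 1, j - 1) * factorial(j - 1) * h(n - j, k - 1)
               for j in range(3, n + 1))


def stirling_coefficient(q):
    """c_q = sum_{k=1..2q} (-1)^k h(2q+2k, k) / (2^(q+k) (q+k)!)."""
    return sum(Fraction((-1) ** k * h(2 * q + 2 * k, k),
                        2 ** (q + k) * factorial(q + k))
               for k in range(1, 2 * q + 1))


coefficients = [stirling_coefficient(q) for q in (1, 2, 3)]
```

The three values reproduce the coefficients $\frac{1}{12}$, $\frac{1}{288}$ and $-\frac{139}{51840}$ displayed in the expansion of $I_n$.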
The scope of the method goes much beyond the case of integrals of large powers. 
Roughly, what is needed is a localization of the main contribution of an integral 
to a smaller range (“Neglect the tails”) where local approximations can be applied 
(“Centrally approximate”). The approximate integral is then finally estimated by 
completing back the tails (“Complete the tails”). 

The Laplace method is excellently described in books by de Bruijn [111] and 
Henrici [265]. A thorough discussion of special cases and multidimensional integrals 
is found in the book by Bleistein and Handelsman [59]. Its principles are fundamental 
to the development of the saddle point method in Chapter VIII. 


▷ II.21. The classical proof of Stirling’s formula. This proceeds from the integral 

$$I_n := \int_0^{\infty} e^{-x} x^n\, dx \quad (= n!).$$

The maximum of the integrand is at $x_0 = n$ and the central range is now $n \pm \kappa_n\sqrt{n}$. 
Reduction to a Gaussian integral follows, though the estimate is no longer an immediate case 
of application of Theorem B.6. ◁ 


Laplace’s method for sums. The basic principles of the method of Laplace (for 
integrals) can often be recycled for the asymptotic evaluation of discrete sums. 
Take a finite or infinite sum $S_n$ defined by 

$$S_n := \sum_{k} t(n, k).$$

A preliminary task consists in working out the general aspect of the family of numbers 
$\{t(n, k)\}$ for fixed (but large) $n$ as $k$ varies. In particular, one should locate the 
value $k_0 \equiv k_0(n)$ of $k$ for which $t(n, k)$ is maximal. In a vast number of cases, tails 
can be neglected; a central approximation $\widehat{t}(n, k)$ of $t(n, k)$ for $k$ in the “central” region 
near $k_0$ can be determined, frequently under the form [remember that we use in 
this book ‘$\approx$’ in the loose sense of ‘approximately equal’] 

$$\widehat{t}(n, k) \approx s(n)\,\phi\left(\frac{k - k_0}{\sigma_n}\right).$$

There $\phi$ is some simple smooth function while $s(n)$ and $\sigma_n$ are scaling constants. The 
quantity $\sigma_n$ indicates the range of the asymptotically significant terms. One may then 
expect 

$$S_n \approx s(n)\sum_{k}\phi\left(\frac{k - k_0}{\sigma_n}\right).$$

Then provided $\sigma_n \to \infty$, one may further expect to approximate the sum by an integral, 
which after completing the tails, gives 

$$S_n \approx s(n)\,\sigma_n\int_{-\infty}^{\infty}\phi(t)\, dt.$$
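Case study: Sums of powers of binomial coefficients. Here is, in telegraphic 
style, an application to sums of powers of binomial coefficients: 

$$S_n^{(r)} := \sum_{k=-n}^{+n}\binom{2n}{n+k}^{r}.$$

The largest term arises at $k_0 = 0$. Also, one has elementarily 

$$\frac{\binom{2n}{n+k}}{\binom{2n}{n}} = \frac{\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right)}{\left(1 + \frac{1}{n}\right)\left(1 + \frac{2}{n}\right)\cdots\left(1 + \frac{k}{n}\right)}.$$

Upon taking logarithms, using approximations of $\log(1 + x)$, and exponentiating back, 
one finds 

$$\frac{\binom{2n}{n+k}}{\binom{2n}{n}} = \exp\left(-\frac{k^2}{n} + O(k^3 n^{-2})\right).$$

This approximation holds for $k = o(n^{2/3})$, where it provides a gaussian approximation 
($\phi(x) = e^{-rx^2}$) with a span of $\sigma_n = \sqrt{n}$. Tails can be neglected to the effect 
that 

$$S_n^{(r)} \sim \binom{2n}{n}^{r}\sum_{k}\exp\left(-\frac{rk^2}{n}\right),$$

say with $|k| < n^{1/2}\kappa_n$, where $\kappa_n = n^{1/10}$. Then approximating the Riemann sum by 
an integral and completing the tails, one gets 

(42)    $$S_n^{(r)} \sim \binom{2n}{n}^{r}\sqrt{n}\int_{-\infty}^{\infty} e^{-rw^2}\, dw, \qquad\text{that is,}\qquad S_n^{(r)} \sim \frac{2^{2rn}}{\sqrt{r}}\,(\pi n)^{-(r-1)/2},$$

which is our final estimate. 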
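
The estimate (42) is easy to confront with exact values. The sketch below is illustrative (names are ours): it computes $S_n^{(r)}$ exactly with big integers and compares logarithms with those of $2^{2rn} r^{-1/2} (\pi n)^{-(r-1)/2}$. For $r = 1$ the estimate is exact ($S_n^{(1)} = 4^n$), and for $r = 2$ the sum equals $\binom{4n}{2n}$, which gives two independent sanity checks.

```python
from math import comb, log, pi


def log_power_sum(n, r):
    """log of S_n^(r) = sum_{k=-n..n} binom(2n, n+k)^r, summed exactly first."""
    return log(sum(comb(2 * n, n + k) ** r for k in range(-n, n + 1)))


def log_estimate(n, r):
    """log of the estimate (42): 2^(2rn) r^(-1/2) (pi n)^(-(r-1)/2)."""
    return 2 * r * n * log(2) - 0.5 * log(r) - 0.5 * (r - 1) * log(pi * n)


def relative_gap(n, r):
    """log(exact / estimate); close to the relative error when small."""
    return log_power_sum(n, r) - log_estimate(n, r)


gaps = {r: relative_gap(200, r) for r in (1, 2, 3)}
```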
▷ II.22. Elementary approximation of Bell numbers. The Bell numbers counting set partitions 
are 

$$B_n = n!\,[z^n]\, e^{e^z - 1} = e^{-1}\sum_{k=0}^{\infty}\frac{k^n}{k!}.$$

The largest term occurs for $k$ near $e^u$ where $u$ is the positive root of the equation $ue^u = n + 1$; 
the central terms are approximately Gaussian. There results the estimate, 

(43)    $$B_n = n!\, e^{-1}(2\pi)^{-1/2}(1 + u^{-1})^{-1/2}\exp\left(e^u(1 - u\log u) - \frac{1}{2}u\right)\bigl(1 + O(e^{-u})\bigr).$$

This example is taken from de Bruijn’s book [111, p. 108]. ◁ 
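
The quality of the estimate (43) can be observed numerically. The sketch below is an illustration with our own function names: it computes Bell numbers exactly with the Bell triangle, solves $ue^u = n + 1$ by Newton’s method, and forms the ratio of $B_n$ to the main term of (43), working with logarithms to avoid overflow. The ratio should be close to 1, within a relative error of order $e^{-u}$.

```python
from math import exp, lgamma, log, pi


def bell_numbers(count):
    """First `count` Bell numbers B_0, B_1, ... computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(count - 1):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


def solve_u(n, iterations=60):
    """Positive root of u e^u = n + 1, by Newton's method started at log(n + 1)."""
    u = log(n + 1.0)
    for _ in range(iterations):
        u -= (u * exp(u) - (n + 1)) / ((u + 1) * exp(u))
    return u


def log_bell_estimate(n):
    """Logarithm of the main term of (43)."""
    u = solve_u(n)
    return (lgamma(n + 1) - 1 - 0.5 * log(2 * pi) - 0.5 * log(1 + 1 / u)
            + exp(u) * (1 - u * log(u)) - 0.5 * u)


bells = bell_numbers(101)
ratios = {n: exp(log(bells[n]) - log_bell_estimate(n)) for n in (10, 50, 100)}
```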


7. Mellin transform. The Mellin transform of a function $f$ defined over $\mathbb{R}_{>0}$ is the 
complex-variable function $f^\star(s)$ defined by the integral 

(44)    $$f^\star(s) := \int_0^{\infty} f(x)\, x^{s-1}\, dx.$$

This transform is also occasionally denoted by $\mathcal{M}[f]$ or $\mathcal{M}[f(x); s]$. Its importance 
devolves from two properties: (i) it maps asymptotic expansions of a function at 0 
and $+\infty$ to singularities of the transform; (ii) it factorizes harmonic sums (defined 
below). The conjunction of the mapping property and the harmonic sum property 
makes it possible to analyse asymptotically rather complicated sums arising from a 
linear superposition of models taken at different scales. Major properties are summarized 
in Figure 4. In this brief review, detailed analytic conditions must be omitted: 
see [184] as well as comments and references at the end of this entry. 

It is assumed that $f$ is locally integrable. Then, the two conditions, 

$$f(x) \underset{x\to 0^+}{=} O(x^u), \qquad f(x) \underset{x\to +\infty}{=} O(x^v),$$

guarantee that $f^\star$ exists for $s$ in a strip, 

$$s \in \langle -u, -v\rangle, \qquad\text{i.e.,}\qquad -u < \Re(s) < -v.$$

Thus existence of the transform is granted provided $v < u$. The prototypical Mellin 
transform is the Gamma function discussed earlier in this appendix: 

$$\Gamma(s) := \int_0^{\infty} e^{-x} x^{s-1}\, dx = \mathcal{M}[e^{-x}; s], \qquad 0 < \Re(s) < \infty.$$

Similarly $f(x) = (1 + x)^{-1}$ is $O(x^0)$ at 0 and $O(x^{-1})$ at infinity, and hence its 
transform exists in the strip $\langle 0, 1\rangle$; it is in fact $\pi/\sin \pi s$, as a consequence of the 
Eulerian Beta integral. The Heaviside function defined by $H(x) := [\![0 \le x < 1]\!]$ 
has a transform that exists in $\langle 0, +\infty\rangle$ and equals $1/s$. 


Harmonic sum property. The Mellin transform is a linear transform. In addition, 
it satisfies the simple but important rescaling rule: 

$$f(x) \overset{\mathcal{M}}{\mapsto} f^\star(s) \qquad\text{implies}\qquad f(\mu x) \overset{\mathcal{M}}{\mapsto} \mu^{-s} f^\star(s),$$

for any $\mu > 0$. Linearity then entails the derived rule 

(45)    $$\sum_{k}\lambda_k f(\mu_k x) \overset{\mathcal{M}}{\mapsto} \left(\sum_{k}\lambda_k\mu_k^{-s}\right)\cdot f^\star(s),$$

valid a priori for any finite set of pairs $(\lambda_k, \mu_k)$ and extending to infinite sums whenever 
the interchange of $\int$ and $\sum$ is permissible. A sum of the form (45) is called 
a harmonic sum, the function $f$ is the “base function”, the $\lambda$’s are the “amplitudes” 
and the $\mu$’s the “frequencies”. Equation (45) then yields the “harmonic sum rule”: 
The Mellin transform of a harmonic sum factorizes as the product of the transform of 
the base function and a generalized Dirichlet series associated to amplitudes and frequencies. 
Harmonic sums surface recurrently in the context of analytic combinatorics 
and Mellin transforms are a method of choice for coping with them. 
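Here are a few examples of application of the rule (45):

$$\sum_{k\ge 1} e^{-k^2x^2} \;\mapsto\; \tfrac12\,\Gamma(s/2)\,\zeta(s) \quad (\Re(s)>1), \qquad \sum_{k\ge 0} e^{-x2^k} \;\mapsto\; \frac{\Gamma(s)}{1-2^{-s}} \quad (\Re(s)>0),$$

$$\sum_{k\ge 1} (\log k)\, e^{-\sqrt{k}\,x} \;\mapsto\; -\zeta'(s/2)\,\Gamma(s) \quad (\Re(s)>2), \qquad \sum_{k\ge 1} \frac{1}{k(k+x)} \;\mapsto\; \zeta(2-s)\,\frac{\pi}{\sin \pi s} \quad (0<\Re(s)<1).$$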


▷ II.23. Connection between power series and Dirichlet series. Let $(f_n)$ be a sequence of
numbers with at most polynomial growth, $f_n = O(n^{r})$, and with OGF $f(z)$. Then, one has

$$\sum_{n\ge 1} \frac{f_n}{n^{s}} = \frac{1}{\Gamma(s)} \int_0^{\infty} f(e^{-x})\,x^{s-1}\,dx, \qquad \Re(s) > r+1.$$




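Function $f(x)$ | Mellin transform $f^*(s)$ | Property

$f(x)$ | $\int_0^{\infty} f(x)\,x^{s-1}\,dx$ | definition, $s \in \langle -u, -v\rangle$
$\frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty} f^*(s)\,x^{-s}\,ds$ | $f^*(s)$ | inversion th., $-u < c < -v$
$\sum_i \lambda_i f_i(x)$ | $\sum_i \lambda_i f_i^*(s)$ | linearity
$f(\mu x)$ | $\mu^{-s} f^*(s)$ | scaling rule ($\mu > 0$)
$x^{\rho} f(x^{\theta})$ | $\frac{1}{\theta}\, f^*\!\left(\frac{s+\rho}{\theta}\right)$ | power rule
$\sum_i \lambda_i f(\mu_i x)$ | $\left(\sum_i \lambda_i \mu_i^{-s}\right)\cdot f^*(s)$ | harmonic sum rule ($\mu_i > 0$)
$\int_0^{\infty} \lambda(t) f(tx)\,dt$ | $\int_0^{\infty} \lambda(t)\,t^{-s}\,dt \cdot f^*(s)$ | harmonic integral rule
$f(x)\log^k x$ | $\partial_s^k f^*(s)$ | diff. I, $k \in \mathbb{Z}_{\ge 0}$, $\partial_s := \frac{d}{ds}$
$\sim x^{\alpha}(\log x)^k$ as $x \to 0$ | $\sim \frac{(-1)^k k!}{(s+\alpha)^{k+1}}$ as $s \to -\alpha$ | mapping: $x \to 0$, left poles
$\sim x^{\beta}(\log x)^k$ as $x \to \infty$ | $\sim -\frac{(-1)^k k!}{(s+\beta)^{k+1}}$ as $s \to -\beta$ | mapping: $x \to \infty$, right poles

FIGURE II.4. A summary of major properties of Mellin transforms.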


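For instance, one obtains the Mellin pairs

$$\frac{e^{-x}}{1-e^{-x}} \;\mapsto\; \zeta(s)\,\Gamma(s) \quad (\Re(s) > 1), \qquad \log\frac{1}{1-e^{-x}} \;\mapsto\; \zeta(s+1)\,\Gamma(s) \quad (\Re(s) > 0).$$

These serve to analyse sums or, conversely, deduce analytic properties of Dirichlet series. ◁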


Mapping properties. Mellin transforms map asymptotic terms in the expansions
of a function $f$ at 0 and $+\infty$ onto singular terms of the transform $f^*$. This property
stems from the basic identities

$$H(x)\,x^{\alpha} \;\mapsto\; \frac{1}{s+\alpha} \quad (s \in \langle -\alpha, +\infty\rangle), \qquad (1-H(x))\,x^{\beta} \;\mapsto\; -\frac{1}{s+\beta} \quad (s \in \langle -\infty, -\beta\rangle),$$

as well as what one obtains by differentiation with respect to $\alpha$, $\beta$.

The converse mapping property also holds. As for other integral transforms,
there is an inversion formula: if $f$ is continuous in an interval containing $x$, then

$$f(x) = \frac{1}{2i\pi} \int_{c-i\infty}^{c+i\infty} f^*(s)\,x^{-s}\,ds, \tag{46}$$

where the abscissa $c$ should be chosen in the "fundamental strip" of $f$; for instance
any $c$ satisfying $-u < c < -v$ with $u$, $v$ as above is suitable.

In many cases of practical interest, $f^*$ is continuable as a meromorphic function
to the whole of $\mathbb{C}$. If the continuation of $f^*$ does not grow too fast along vertical lines,
then one can estimate the inverse Mellin integral of (46) by residues. This corresponds
to shifting the line of integration to some $d \ne c$ and taking poles into account by the
residue theorem. Since the residue at a pole $s_0$ of $f^*$ involves a factor of $x^{-s_0}$, the
contribution of $s_0$ will give useful information on $f(x)$ as $x \to \infty$ if $s_0$ lies to the
right of $c$, and on $f(x)$ as $x \to 0$ if $s_0$ lies to the left. Higher order poles introduce
additional logarithmic factors. The "dictionary" is simply

$$\frac{1}{(s-s_0)^{k+1}} \;\overset{\mathcal{M}^{-1}}{\longmapsto}\; \pm\frac{(-1)^k}{k!}\,x^{-s_0}(\log x)^k, \tag{47}$$

where the sign is '+' for a pole on the left of the fundamental strip and '−' for a pole
on the right.


Mellin asymptotic summation. The combination of mapping properties and the
harmonic sum property constitutes a powerful tool of asymptotic analysis. As an
example, let us first investigate the pair

$$F(x) := \sum_{k\ge 1} \frac{1}{1+k^2x^2}, \qquad F^*(s) = \frac12\,\frac{\pi}{\sin\frac12\pi s}\,\zeta(s),$$

where $F^*$ results from the harmonic sum rule and is originally defined in the strip
$\langle 1,2\rangle$. The function is meromorphically continuable to the whole of $\mathbb{C}$ with poles at
the points 0, 1, 2 and 4, 6, 8, .... The transform $F^*$ is small towards infinity, so that
application of the dictionary (47) is justified. One then finds mechanically:

$$F(x) \underset{x\to 0^+}{\sim} \frac{\pi}{2x} - \frac12 + O(x^{M}), \qquad F(x) \underset{x\to+\infty}{\sim} \frac{\pi^2}{6x^2} - \frac{\pi^4}{90x^4} + \cdots,$$

for any $M > 0$.
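The two estimates can be checked numerically. The following Python sketch is only an illustration: the function names are ours, and the tail of the series is replaced by an integral (a midpoint-rule approximation that we assume to be accurate enough for the purpose). It compares a direct evaluation of $F(x) = \sum_{k\ge1} 1/(1+k^2x^2)$ with $\pi/(2x) - 1/2$ for small $x$ and with $\pi^2/(6x^2) - \pi^4/(90x^4)$ for large $x$.

```python
import math

def harmonic_sum_F(x, n_terms=20000):
    """Approximate F(x) = sum_{k>=1} 1 / (1 + k^2 x^2).

    The first n_terms terms are summed directly. The remaining tail is
    replaced by the integral of 1 / (1 + t^2 x^2) over t >= n_terms + 1/2,
    which equals atan(1 / ((n_terms + 1/2) x)) / x (midpoint-rule estimate,
    an assumption of this sketch).
    """
    head = sum(1.0 / (1.0 + (k * x) ** 2) for k in range(1, n_terms + 1))
    tail = math.atan(1.0 / ((n_terms + 0.5) * x)) / x
    return head + tail

def small_x_estimate(x):
    """Contribution of the poles at s = 1 and s = 0 (left of the strip)."""
    return math.pi / (2.0 * x) - 0.5

def large_x_estimate(x):
    """Contribution of the poles at s = 2 and s = 4 (right of the strip)."""
    return math.pi ** 2 / (6.0 * x ** 2) - math.pi ** 4 / (90.0 * x ** 4)

if __name__ == "__main__":
    for x in (0.1, 0.2, 0.5):
        print(x, harmonic_sum_F(x), small_x_estimate(x))
    for x in (5.0, 10.0, 20.0):
        print(x, harmonic_sum_F(x), large_x_estimate(x))
```

For small $x$ the agreement is far better than any fixed power of $x$ would suggest, in accordance with the $O(x^M)$ error term; for large $x$ the discrepancy is of the order of the first neglected term, which comes from the pole at $s = 6$.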
A particularly important quantity in analytic combinatorics is the harmonic sum

$$\Phi(x) := \sum_{k\ge 0} \left(1 - e^{-x/2^k}\right).$$

It occurs for instance in the analysis of longest runs in words (p. 288). By the harmonic
sum rule, one finds

$$\Phi^*(s) = -\frac{\Gamma(s)}{1-2^{s}}, \qquad s \in \langle -1, 0\rangle.$$

(The transform of $e^{-x} - 1$ is also $\Gamma(s)$, but in the shifted strip $\langle -1,0\rangle$.) The singularities
of $\Phi^*$ are at $s = 0$, where there is a double pole, at $s = -1, -2, \ldots$, which are
simple poles, but also at the complex points

$$\chi_k = \frac{2ik\pi}{\log 2}.$$

The Mellin dictionary (47) can still be applied provided one integrates along a long
rectangular contour that passes in-between poles. The salient feature is here the presence
of fluctuations induced by the imaginary poles, since

$$x^{-\chi_k} = \exp\left(-2ik\pi \log_2 x\right),$$

and each pole induces a Fourier element. All in all, one finds (any $M > 0$):

$$\Phi(x) \underset{x\to+\infty}{\sim} \log_2 x + \frac{\gamma}{\log 2} + \frac12 + P(x) + O(x^{-M}), \qquad P(x) := -\frac{1}{\log 2} \sum_{k\in\mathbb{Z}\setminus\{0\}} \Gamma\!\left(\frac{2ik\pi}{\log 2}\right) e^{-2ik\pi \log_2 x}.$$

The analysis for $x \to 0$ is also possible: in this particular case, it yields

$$\Phi(x) \underset{x\to 0}{\sim} \sum_{n\ge 1} \frac{(-1)^{n-1}}{1-2^{-n}}\,\frac{x^n}{n!},$$

which is what would result from expanding the exponential in $\Phi(x)$ and reorganizing
the terms, and consequently constitutes an exact representation (i.e., '∼' can be
replaced by '=').
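A short numerical experiment makes the two regimes tangible. The Python sketch below is an illustration under our own naming; the truncation orders are assumptions chosen generously. It evaluates $\Phi(x) = \sum_{k\ge0}(1 - e^{-x/2^k})$ directly, then compares it with the non-oscillating terms $\log_2 x + \gamma/\log 2 + 1/2$ at large $x$ and with the power series at small $x$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def phi(x, n_terms=200):
    """Phi(x) = sum_{k>=0} (1 - exp(-x / 2^k)), truncated after n_terms terms."""
    # -expm1(-y) evaluates 1 - exp(-y) accurately even when y is tiny.
    return sum(-math.expm1(-x / 2.0 ** k) for k in range(n_terms))

def phi_main_terms(x):
    """Non-oscillating part of the expansion of Phi(x) as x -> +infinity."""
    return math.log2(x) + EULER_GAMMA / math.log(2.0) + 0.5

def phi_small_x_series(x, n_terms=40):
    """Series at 0: sum_{n>=1} (-1)^(n-1) / (1 - 2^(-n)) * x^n / n!."""
    total, term = 0.0, 1.0
    for n in range(1, n_terms + 1):
        term *= x / n                      # term is now x^n / n!
        total += (-1) ** (n - 1) * term / (1.0 - 2.0 ** (-n))
    return total

if __name__ == "__main__":
    for x in (10.0, 100.0, 1000.0):
        # The difference is the fluctuating term P(x), which is tiny.
        print(x, phi(x) - phi_main_terms(x))
    print(phi(0.5), phi_small_x_series(0.5))
```

The residual printed in the first loop is the periodic function $P(x)$; since $|\Gamma(2i\pi/\log 2)|$ is very small, its amplitude is of the order of $10^{-6}$ and it is easily missed in low-precision experiments.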
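▷ II.24. Mellin-type derivation of Stirling's formula. One has the Mellin pair

$$L(x) := \sum_{k\ge 1} \left[\log\left(1+\frac{x}{k}\right) - \frac{x}{k}\right], \qquad L^*(s) = \frac{\pi}{s\,\sin \pi s}\,\zeta(-s), \quad s \in \langle -2,-1\rangle.$$

Note that $L(x) = \log\left(e^{-\gamma x}/\Gamma(1+x)\right)$. Mellin asymptotics provides

$$L(x) \underset{x\to+\infty}{\sim} -x\log x - (\gamma-1)x - \frac12\log x - \log\sqrt{2\pi} - \frac{1}{12x} + \frac{1}{360x^3} - \frac{1}{1260x^5} + \cdots,$$

where one recognizes Stirling's expansion of $x!$,

$$\log x! \underset{x\to+\infty}{\sim} \log\left(x^{x} e^{-x}\sqrt{2\pi x}\right) + \sum_{n\ge 1} \frac{B_{2n}}{2n(2n-1)}\,x^{1-2n},$$

with $B_n$ the Bernoulli numbers. ◁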


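▷ II.25. Mellin-type analysis of the harmonic numbers. For a parameter $\alpha > 0$, one has the
Mellin pair:

$$K_\alpha(x) := \sum_{k\ge 1} \left[\frac{1}{k^{\alpha}} - \frac{1}{(k+x)^{\alpha}}\right], \qquad K_\alpha^*(s) = -\zeta(\alpha-s)\,\frac{\Gamma(s)\,\Gamma(\alpha-s)}{\Gamma(\alpha)}.$$

This serves to estimate harmonic numbers and their generalisations, for instance

$$H_n \sim \log n + \gamma + \frac{1}{2n} - \sum_{k\ge 2} \frac{B_k}{k}\,n^{-k} \sim \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \cdots,$$

since $K_1(n) = H_n$. ◁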
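As a quick sanity check of the classical expansion $H_n \sim \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4}$, the following Python sketch (function names are ours, and the expansion is truncated after the $n^{-4}$ term) compares it with the exact sum.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def harmonic_number(n):
    """H_n = 1 + 1/2 + ... + 1/n, summed with compensated summation."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def harmonic_estimate(n):
    """log n + gamma + 1/(2n) - 1/(12 n^2) + 1/(120 n^4)."""
    return (math.log(n) + EULER_GAMMA + 1.0 / (2 * n)
            - 1.0 / (12 * n ** 2) + 1.0 / (120 * n ** 4))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, harmonic_number(n) - harmonic_estimate(n))
```

The printed differences decrease like $n^{-6}$, the order of the first neglected term, until they reach the level of floating-point rounding.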


EXAMPLE II.3. Euler-Maclaurin summation via Mellin analysis. Let $f$ be continuous on
$(0, +\infty)$ and satisfy $f(x) = O(x^{-1-\delta})$ as $x \to +\infty$, for some $\delta > 0$, and

$$f(x) \underset{x\to 0^+}{\sim} \sum_{k\ge 0} f_k x^k.$$

The summatory function $F(x)$ satisfies

$$F(x) := \sum_{n\ge 1} f(nx), \qquad F^*(s) = \zeta(s)\,f^*(s),$$

by the harmonic sum rule. The collection of (trimmed) singular expansions of $f^*$ at
$s = 0, -1, -2, \ldots$ is summarized by the formal sum

$$f^*(s) \asymp \left[\frac{f_0}{s}\right]_{s=0} + \left[\frac{f_1}{s+1}\right]_{s=-1} + \left[\frac{f_2}{s+2}\right]_{s=-2} + \cdots.$$

Thus, by the mapping properties, provided $F^*(s)$ is small towards $\pm i\infty$ in finite strips, one has

$$F(x) \underset{x\to 0}{\sim} \frac{1}{x}\int_0^{\infty} f(t)\,dt + \sum_{j\ge 0} f_j\,\zeta(-j)\,x^j,$$

where the main term is associated to the singularity of $F^*$ at 1 and arises from the pole of $\zeta(s)$,
with $f^*(1)$ giving the integral of $f$. The interest of this approach is that it is very versatile and
allows for various forms of asymptotic expansions of $f$ at 0 as well as multipliers like $(-1)^k$,
$\log k$, and so on; see [184] for details and Gonnet's note [241] for alternative approaches.
END OF EXAMPLE II.3.
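To see the mechanism at work on a case where everything is explicit, take $f(x) = e^{-x}$, so that $f_j = (-1)^j/j!$, $\int_0^\infty f = 1$, and $F(x) = \sum_{n\ge1} e^{-nx} = 1/(e^x - 1)$. The Mellin estimate then reads $F(x) \sim \frac1x + \sum_{j\ge0} f_j\,\zeta(-j)\,x^j$. The Python sketch below is an illustration only: the names are ours, and the values of $\zeta(-j)$ for $j \le 5$ are hard-coded from the classical table rather than computed.

```python
import math
from fractions import Fraction

# Classical values zeta(0), zeta(-1), ..., zeta(-5), hard-coded.
ZETA_AT_NEG_INT = [Fraction(-1, 2), Fraction(-1, 12), Fraction(0),
                   Fraction(1, 120), Fraction(0), Fraction(-1, 252)]

def taylor_coeffs_exp_neg(n):
    """Coefficients f_j of f(x) = exp(-x) at 0: f_j = (-1)^j / j!."""
    return [Fraction((-1) ** j, math.factorial(j)) for j in range(n)]

def mellin_estimate(x, integral_of_f=1.0):
    """(1/x) * integral(f) + sum_j f_j * zeta(-j) * x^j, truncated at j = 5."""
    coeffs = taylor_coeffs_exp_neg(len(ZETA_AT_NEG_INT))
    series = sum(float(f_j * z_j) * x ** j
                 for j, (f_j, z_j) in enumerate(zip(coeffs, ZETA_AT_NEG_INT)))
    return integral_of_f / x + series

def summatory_exact(x):
    """F(x) = sum_{n>=1} exp(-n x) = 1 / (exp(x) - 1)."""
    return 1.0 / math.expm1(x)

if __name__ == "__main__":
    for x in (0.5, 0.1, 0.01):
        print(x, summatory_exact(x), mellin_estimate(x))
```

The products $f_j\,\zeta(-j)$ reproduce the familiar coefficients $-\frac12, \frac1{12}, 0, -\frac1{720}, 0, \frac1{30240}$ of the expansion of $1/(e^x-1) - 1/x$, which is the Bernoulli-number expansion that Euler-Maclaurin summation would give.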


General references on Mellin transforms are the books by Doetsch [131] and 
Widder [493]. The term “harmonic sum” and some of the corresponding technol- 
ogy originates with the abstract [204]. This brief presentation is based on the survey 
article [184] to which we refer for a detailed treatment. Mellin analysis of “harmonic 
integrals” is a classical topic of applied mathematics for which we refer to the books 
by Wong [502] and Paris-Kaminski [386]. Useful treatments of properties of use in 
discrete mathematics and analysis of algorithms appear in the books by Hofri [270], 
Mahmoud [351], and Szpankowski [458]. 


8. Several complex variables. The theory of analytic (or holomorphic) functions of
one complex variable extends nontrivially to several complex variables. This profound
theory has been largely developed in the course of the twentieth century. Here
we shall only need the most basic concepts, not the deeper results, of the theory.

Consider the space $\mathbb{C}^m$ endowed with the metric

$$|z|^2 = |(z_1,\ldots,z_m)|^2 = \sum_{j=1}^{m} |z_j|^2,$$

under which it is isomorphic to the Euclidean space $\mathbb{R}^{2m}$. A function $f$ from $\mathbb{C}^m$ to $\mathbb{C}$
is said to be analytic at some point $a$ if in a neighbourhood of $a$ it can be represented
by a convergent power series,

$$f(z) \equiv f(z_1,\ldots,z_m) = \sum_{n} f_n\,(z-a)^{n} \equiv \sum_{n_1,\ldots,n_m} f_{n_1,\ldots,n_m}\,(z_1-a_1)^{n_1}\cdots(z_m-a_m)^{n_m}. \tag{49}$$

There and throughout the theory extensive use is made of multi-index conventions, as 
encountered in Chapter II. 

An expansion (49) converges in a polydisc $\prod_j \{|z_j - a_j| < r_j\}$, for some $r_j > 0$.
A convergent expansion at $(0,\ldots,0)$ has its coefficients majorized in absolute value,
up to a constant factor, by those of a series of the form

$$\prod_{j=1}^{m} \frac{1}{1 - z_j/R_j} = \sum_{n_1,\ldots,n_m} R_1^{-n_1}\cdots R_m^{-n_m}\,z_1^{n_1}\cdots z_m^{n_m}.$$

From there, closure of analytic functions under sums, products, and compositions results
from standard manipulations of majorant series (see Chapter IV for the univariate
case). Finally, a function is analytic in an open set $\Omega \subset \mathbb{C}^m$ iff it is analytic at each
$a \in \Omega$.

A remarkable theorem of Hartogs asserts that $f(z)$ with $z \in \mathbb{C}^m$ is analytic jointly
in all the $z_j$ (in the sense of (49)) if it is analytic separately in each variable $z_j$. (The
version of the theorem that postulates a priori continuity is elementary.)

As in the one-dimensional case, analytic functions can be equivalently defined
by means of differentiability conditions. A function is $\mathbb{C}$-differentiable or holomorphic
at $a$ if as $\Delta z \to 0$ in $\mathbb{C}^m$, one has

$$f(a + \Delta z) - f(a) = \sum_{j=1}^{m} c_j\,\Delta z_j + o(|\Delta z|).$$

The coefficients $c_j$ are the partial derivatives, $c_j = \partial_{z_j} f(a)$. The fact that this relation
does not depend on the way $\Delta z$ tends to 0 implies the Cauchy-Riemann equations.
In a way that parallels the single variable case, it is proved that two conditions are
equivalent: $f$ is analytic; $f$ is complex differentiable.

Iterated integrals are defined in the natural way and one finds, by a repeated use
of calculus in a single variable,

$$f(z) = \frac{1}{(2i\pi)^m} \int_{C_1}\cdots\int_{C_m} \frac{f(\zeta)}{(\zeta_1 - z_1)\cdots(\zeta_m - z_m)}\,d\zeta_1\cdots d\zeta_m, \tag{50}$$

where $C_j$ is a small circle surrounding $z_j$ in the $z_j$-plane. By differentiation under the
integral sign, Equation (50) also provides an integral formula for the partial derivatives
of $f$, which is the analogue of Cauchy's coefficient formula. Iterated integrals are
independent of details of the "polypath" on which they are taken, and uniqueness of
analytic continuation holds.

The theory of functions of several complex variables develops in the direction of 
an integral calculus that is much more powerful than the iterated integrals mentioned 
above; see for instance the book by Aizenberg and Yuzhakov [5] for a multidimen- 
sional residue approach. Egorychev’s monograph [147] develops systematic applica- 
tions of the theory of functions of one or several complex variables to the evaluation 
of combinatorial sums. Pemantle together with several coauthors [388, 389, 390] has 
launched an ambitious research programme meant to extract the coefficients of mero- 
morphic multivariate generating functions by means of this theory, with the ultimate 
goal of obtaining systematically asymptotics from multivariate generating functions. 
In contrast, see especially Chapter IX, we can limit ourselves to developing a pertur- 
bative theory of one-variable complex function theory. 

In the context of this book, the basic notion of analyticity in several complex vari- 
ables serves to confer a bona fide analytic meaning to multivariate generating func- 
tions. Basic definitions are also needed in the context of functions f defined implicitly 
by functional relations of the form H(z, f) = 0 or H(z,u, f) = 0, where analytic 
functions of two or more complex variables (like H) make an appearance. (See in 
particular the discussion of the analytic Implicit Function Theorem and the Weierstrass
Preparation Theorem in this Appendix.)


APPENDIX C 


Complements of Probability Theory 


This appendix contains entries arranged in logical order regarding the following topics: 


Probability spaces and measure; Random variables; Transforms of distributions; 

Special distributions; Convergence in law. 
In this book we start from probability spaces that are finite, since they arise from objects of a 
fixed size in some combinatorial class (see Chapter III of Part A and APPENDIX A: Combi- 
natorial probability, p. 671 for elementary aspects), then need basic properties of continuous 
distributions in order to characterize asymptotic limit laws. The entries in this appendix are 
used principally in Chapter IX of Part C relative to Random Structures. They present a unified 
framework that encompasses discrete and continuous probability distributions alike. 


1. Probability spaces and measure. An axiomatization of probability theory¹ was
discovered in the 1930s by Kolmogorov. A measurable space consists of a set $\Omega$,
called the set of elementary events or the sample set, and a $\sigma$-algebra $\mathcal{A}$ of subsets of $\Omega$
called events (that is, a collection of sets containing $\emptyset$ and closed under complement
and denumerable unions). A measure space is a measurable space endowed with a
measure $\mu : \mathcal{A} \to \mathbb{R}_{\ge 0}$ that is additive over finite or denumerable unions of disjoint
sets; in that case, elements of $\mathcal{A}$ are called measurable sets. A probability space is a
measure space for which the measure satisfies the further normalization $\mu(\Omega) = 1$; in
that case, we also write $\mathbb{P}$ for $\mu$. Any set $S \subseteq \Omega$ such that $\mu(S) = 1$ is called a support
of the probability measure.


The definitions given above cover several important cases. 


(i) Finite sets with the uniform measure, also known as "counting" measure. In
this case, $\Omega$ is finite, all sets are in $\mathcal{A}$ (i.e., are measurable), and ($|\cdot|$ denotes cardinality)

$$\mu(E) := \frac{|E|}{|\Omega|}.$$

Nonuniform measures over a finite set $\Omega$ are determined by assigning a nonnegative
weight $p(\omega)$ to each element of $\Omega$ (with $\sum_{\omega\in\Omega} p(\omega) = 1$) and setting

$$\mu(E) := \sum_{e\in E} p(e).$$

(We also write $\mathbb{P}(e)$ for $\mathbb{P}(\{e\}) = \mu(\{e\}) = p(e)$.) In this book, $\Omega$ is usually the subclass
$\mathcal{C}_n$ formed by the objects of size $n$ in some combinatorial class $\mathcal{C}$. The uniform
probability is normally assumed, although sometimes weighted models are considered:
see for instance in Chapter III the discussion of weighted word models and
Bernoulli trials as well as the case of weighted tree models and branching processes.


¹For this entry we refer to the vivid and well motivated presentation in Williams' book [497] or to
many classical treatises like the ones by Billingsley [55] and Feller [161].




(ii) Discrete probability measures over the integers (supported by $\mathbb{Z}$ or $\mathbb{Z}_{\ge 0}$). In
this case the measure is determined by a function $p : \mathbb{Z} \to \mathbb{R}_{\ge 0}$ and

$$\mu(E) := \sum_{e\in E} p(e),$$

with $\mu(\mathbb{Z}) = 1$. (All sets are measurable.) More general discrete measures supported
by denumerable sets of $\mathbb{R}$ can be similarly defined.

(iii) The real line $\mathbb{R}$ equipped with the $\sigma$-algebra generated by the open intervals
constitutes a standard example of a measurable space; in that case, any member of
the $\sigma$-algebra is known as a Borel set. The measure, denoted by $\lambda$, that assigns to an
interval $(a, b)$ the value $\lambda(a, b) = b - a$ (and is extended nontrivially to all Borel sets
by additivity) is known as the Lebesgue measure. The interval $[0, 1]$ endowed with $\lambda$
is a probability space. The line $\mathbb{R}$ itself is not a probability space since $\lambda(\mathbb{R}) = +\infty$.

In the measure-theoretic framework, a random variable is a mapping $X$ from
a probability space $\Omega$ (equipped with its $\sigma$-algebra $\mathcal{A}$ and its measure $\mathbb{P}_\Omega$) to $\mathbb{R}$
(equipped with its Borel sets $\mathcal{B}$) such that the preimage $X^{-1}(B)$ of any $B \in \mathcal{B}$ lies
in $\mathcal{A}$. For $B \in \mathcal{B}$, the probability that $X$ lies in $B$ is then defined as

$$\mathbb{P}(X \in B) := \mathbb{P}_\Omega(X^{-1}(B)).$$

Since the Borel sets can be generated by the semi-infinite intervals $(-\infty, x]$, this probability
is equivalently determined by the function

$$F(x) := \mathbb{P}(X \le x),$$

which is called the distribution function or cumulative distribution function of $X$.
It is then possible to introduce random variables directly by means of distribution
functions, see the next entry below, Random variables.

The next step is to go from measures of sets to integrals of (real valued) functions.
Lebesgue integrals are constructed, first for indicator functions of intervals,
then for simple (staircase) functions, then for nonnegative functions, finally for integrable
functions. One defines in this way, for an arbitrary measure $\mu$, the Lebesgue
integral

$$\int f\,d\mu, \quad\text{also written}\quad \int f(x)\,d\mu(x) \quad\text{or}\quad \int f(x)\,\mu(dx), \tag{1}$$

where the last notation is often preferred by probabilists. The basic idea is to decompose
the domain of values of $f$ into finitely many measurable sets $(A_j)$ and, for a
positive function $f$, consider the supremum over all finite decompositions $(A_j)$

$$\int f\,d\mu := \sup_{(A_j)} \sum_j \Big(\inf_{\omega\in A_j} f(\omega)\Big)\,\mu(A_j). \tag{2}$$

(Thus Riemann integration proceeds by decomposing the domain of the function's
arguments while Lebesgue integration decomposes the domain of values and appeals to
a richer notion of measure.)

In (1) and (2), the possibility exists that $\mu$ assigns a nonzero measure to certain
individual points. In such a context, the integral is sometimes referred to as
the Lebesgue-Stieltjes integral. It suitably generalizes the Riemann-Stieltjes integral
which, given a real valued function $M$, defines the following extension of the standard
Riemann integral:

$$\int f(x)\,dM(x) = \lim_{(B_k)} \sum_k f(x_k)\,\Delta_{B_k}(M). \tag{3}$$

There the $B_k$ form a finite partition of the domain in which the argument of $f$ ranges,
the limit is taken as the size of the largest $B_k$ tends to 0, each $x_k$ lies in $B_k$, and $\Delta_{B_k}(M)$ is
the variation of $M$ on $B_k$. The great advantage of Stieltjes (hence automatically of
Lebesgue) integrals is to unify many of the formulae relative to discrete and continuous
probability distributions while providing a simple framework adapted to mixed cases.


2. Random variables. A real random variable $X$ is fully characterized by its (cumulative)
distribution function

$$F_X(x) := \mathbb{P}(X \le x),$$

which is a nondecreasing right-continuous function satisfying $F(-\infty) = 0$,
$F(+\infty) = 1$.

A variable is discrete if it is supported by a finite or denumerable set. Almost
all discrete distributions in this book are supported by $\mathbb{Z}$ or $\mathbb{Z}_{\ge 0}$. (An interesting
exception is the collection of limit distributions occurring in longest runs of words;
see Chapter IV.)

A variable $X$ is continuous if it assigns zero probability mass to any finite or
denumerable set. In particular, its distribution function has no jump. An easy theorem
states that any distribution function can be decomposed into a discrete and a continuous part,

$$F(x) = c_1 F^{d}(x) + c_2 F^{c}(x), \qquad c_1 + c_2 = 1.$$

(The jumps must sum to at most 1, hence their set is at most denumerable.) A variable
is absolutely continuous if it assigns zero probability mass to any Borel set of measure 0.
In that case, the Radon-Nikodym Theorem asserts that there exists a function
$w$ such that

$$F_X(x) = \int_{-\infty}^{x} w(y)\,dy.$$

(There, in all generality, the Lebesgue integral is required but the Riemann integral is
sufficient for all practical purposes in this book.) The function $w(x)$ is called a density
of the random variable $X$ (or of its distribution function). When $F_X$ is differentiable
everywhere it admits the density

$$w(x) = \frac{d}{dx} F_X(x),$$

by the Fundamental Theorem of Calculus.

▷ III.1. The Lebesgue decomposition theorem. It states that any distribution function $F(x)$
decomposes as

$$F(x) = c_1 F^{d}(x) + c_2 F^{ac}(x) + c_3 F^{s}(x), \qquad c_1 + c_2 + c_3 = 1,$$

where $F^{d}$ is discrete, $F^{ac}$ is absolutely continuous, and $F^{s}$ is continuous but singular, i.e., it
is supported by a Borel set of Lebesgue measure 0. Singular random variables are constructed,
e.g., from the Cantor set. ◁

In this book, all combinatorial distributions are discrete (and then usually supported
by $\mathbb{Z}_{\ge 0}$). All continuous distributions obtained as limits of discrete ones are,
in our context, absolutely continuous and the qualifier "absolutely" is globally understood
when discussing continuous distributions.

If $X$ is a random variable, the expectation of a function $g(X)$ is defined as

$$\mathbb{E}(g(X)) = \int_{\Omega} g(X)\,d\mathbb{P} = \int_{\mathbb{R}} g(x)\,dF(x),$$

where the latter form involves the distribution function $F$ of $X$. In particular the
expectation or mean of $X$ is $\mathbb{E}(X)$, and generally its moment of order $r$ is

$$\mu^{(r)} := \mathbb{E}(X^{r}).$$

(These quantities may not exist for $r \ne 0$.)

▷ III.2. Alternative formulae for expectations. If $X$ is supported by $\mathbb{R}_{\ge 0}$ and has a density:

$$\mathbb{E}(X) = \int_0^{\infty} (1 - F(x))\,dx.$$

If $X$ is supported by $\mathbb{Z}_{\ge 0}$:

$$\mathbb{E}(X) = \sum_{k\ge 0} \mathbb{P}(X > k).$$

Proofs are by partial integration and summation: for instance with $p_k = \mathbb{P}(X = k)$,

$$\mathbb{E}(X) = \sum_{k\ge 1} k\,p_k = (p_1 + p_2 + p_3 + \cdots) + (p_2 + p_3 + \cdots) + (p_3 + \cdots) + \cdots.$$

Similar formulae hold for higher moments. ◁
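The tail-sum formula $\mathbb{E}(X) = \sum_{k\ge0}\mathbb{P}(X>k)$ is easy to test on a finite distribution. In the Python sketch below (an illustration; the helper names and the choice of a binomial law with $n = 4$, $p = 1/3$ are ours), a distribution on $\{0,\ldots,n\}$ is represented by the list of its probabilities, and exact rational arithmetic avoids any rounding issue.

```python
import math
from fractions import Fraction

def mean_direct(pmf):
    """E(X) = sum_k k * P(X = k) for a distribution on {0, 1, ..., len(pmf)-1}."""
    return sum(k * p for k, p in enumerate(pmf))

def mean_from_tails(pmf):
    """E(X) = sum_{k>=0} P(X > k), accumulating the tail probabilities."""
    total = Fraction(0)
    tail = sum(pmf, Fraction(0))   # P(X >= 0), equal to 1 for a distribution
    for p in pmf:
        tail -= p                  # tail is now P(X > k)
        total += tail
    return total

def binomial_pmf(n, p):
    """Probabilities P(X = k), k = 0..n, of the binomial law (exact rationals)."""
    return [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

if __name__ == "__main__":
    pmf = binomial_pmf(4, Fraction(1, 3))
    print(mean_direct(pmf), mean_from_tails(pmf))
```

Both computations return the same rational number, here the mean $np = 4/3$ of the binomial distribution.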


3. Transforms of distributions. The Laplace transform of $X$ (or of its distribution
function $F$) is defined² by

$$\lambda_X(s) := \mathbb{E}(e^{sX}) = \int_{-\infty}^{+\infty} e^{sx}\,dF(x),$$

and is also known as the moment generating function (see below for an existential
discussion). The characteristic function is defined by

$$\phi_X(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{+\infty} e^{itx}\,dF(x),$$

and it is a Fourier transform. Both transforms are formal variants of one another and
$\phi_X(t) = \lambda_X(it)$.

If $X$ is discrete and supported by $\mathbb{Z}$, then its probability generating function (PGF)
is defined as

$$P_X(u) := \mathbb{E}(u^{X}) = \sum_{k\in\mathbb{Z}} \mathbb{P}(X = k)\,u^{k}.$$

As an analytic object this always exists when $X$ is nonnegative (supported by $\mathbb{Z}_{\ge 0}$),
in which case the PGF is analytic at least in the open disc $|u| < 1$. If $X$ assumes
arbitrarily large negative values, then the PGF certainly exists on the unit circle, but
sometimes not on a larger domain. The precise domain of existence of the PGF as an
analytic function depends on the geometric rate of decay of the left and right tails of
the distribution, that is, of $\mathbb{P}(X = k)$ as $k \to \pm\infty$. The characteristic function of the
variable $X$ (and of its distribution function $F_X$) is

$$\phi_X(t) = \mathbb{E}(e^{itX}) = P_X(e^{it}) = \sum_{k\in\mathbb{Z}} \mathbb{P}(X = k)\,e^{ikt}.$$

It always exists for real values of $t$. The Laplace transform of a discrete distribution is

$$\lambda_X(s) = \mathbb{E}(e^{sX}) = P_X(e^{s}) = \sum_{k\in\mathbb{Z}} \mathbb{P}(X = k)\,e^{ks}.$$

²If $F$ has a discrete component, then integration is to be taken in the sense of Lebesgue-Stieltjes or
Riemann-Stieltjes.


If $X$ is a continuous random variable with distribution function $F(x)$ and density
$w(x)$, then the characteristic function is expressed as

$$\phi_X(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{+\infty} e^{itx}\,w(x)\,dx,$$

and the Laplace transform is

$$\lambda_X(s) := \mathbb{E}(e^{sX}) = \int_{-\infty}^{+\infty} e^{sx}\,w(x)\,dx.$$

The Fourier transform always exists for real arguments (by integrability of the Fourier
kernel $e^{itx}$ whose modulus is 1). The Laplace transform, when it exists in a strip,
extends analytically the characteristic function via the equality $\phi_X(t) = \lambda_X(it)$. The
Laplace transform is also called the moment generating function since an alternative
formulation of its definition, valid for discrete and continuous cases alike, is

$$\lambda_X(s) = \sum_{k\ge 0} \mathbb{E}(X^{k})\,\frac{s^{k}}{k!},$$

which indeed represents the exponential generating function of moments. (We prefer
not to use this terminology so as to avoid a possible confusion with the many other
types of generating functions employed in this book.)

▷ III.3. Centring, scaling, and standardization. Let $X$ be a random variable. Define

$$Y := \frac{X - \mu}{\sigma}.$$

The representations as expectations of the Laplace transform and of the characteristic
function make it obvious that

$$\phi_Y(t) = e^{-i\mu t/\sigma}\,\phi_X\!\left(\frac{t}{\sigma}\right), \qquad \lambda_Y(s) = e^{-\mu s/\sigma}\,\lambda_X\!\left(\frac{s}{\sigma}\right).$$

One says that $Y$ is obtained from $X$ by centring (by a shift of $\mu$) and scaling (by a factor of $\sigma$).
If $\mu$ and $\sigma$ are the mean and standard deviation of $X$, then one says that $Y$ is a standardized
version of $X$. ◁


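▷ III.4. Moments and transforms. The moments are accessible from either transform,

$$\mu^{(r)} := \mathbb{E}\{X^{r}\} = \left.\frac{d^{r}}{ds^{r}}\lambda(s)\right|_{s=0} = (-i)^{r}\left.\frac{d^{r}}{dt^{r}}\phi(t)\right|_{t=0}.$$

In particular, we have

$$\mu = \left.\frac{d}{ds}\lambda(s)\right|_{s=0} = -i\left.\frac{d}{dt}\phi(t)\right|_{t=0},$$

$$\mu^{(2)} = \left.\frac{d^{2}}{ds^{2}}\lambda(s)\right|_{s=0} = -\left.\frac{d^{2}}{dt^{2}}\phi(t)\right|_{t=0},$$

$$\sigma^{2} = \left.\frac{d^{2}}{ds^{2}}\log\lambda(s)\right|_{s=0} = -\left.\frac{d^{2}}{dt^{2}}\log\phi(t)\right|_{t=0}.$$

The direct expression of the standard deviation in terms of $\log\lambda(s)$, called the cumulant generating
function, often proves computationally handy. ◁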
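The convenience of the cumulant generating function can be illustrated numerically. For the binomial distribution of parameters $n, p$, the PGF $(q + pu)^n$ gives $\log\lambda(s) = n\log(q + pe^{s})$, whose first two derivatives at 0 should equal the mean $np$ and the variance $npq$. The Python sketch below is an illustration with our own names; the derivatives are approximated by central finite differences with a step $h$ that we simply assume to be a reasonable compromise between truncation and rounding errors.

```python
import math

def cumulant_gf_binomial(s, n, p):
    """log lambda(s) for the binomial law: n * log(q + p * e^s), with q = 1 - p."""
    return n * math.log((1.0 - p) + p * math.exp(s))

def mean_and_variance(cgf, h=1e-4):
    """First and second derivatives of cgf at 0, by central finite differences."""
    first = (cgf(h) - cgf(-h)) / (2.0 * h)
    second = (cgf(h) - 2.0 * cgf(0.0) + cgf(-h)) / (h * h)
    return first, second

if __name__ == "__main__":
    n, p = 10, 0.3
    mean, variance = mean_and_variance(lambda s: cumulant_gf_binomial(s, n, p))
    print(mean, n * p)                  # numerical mean versus n p
    print(variance, n * p * (1 - p))    # numerical variance versus n p q
```

In symbolic work one would of course differentiate $\log\lambda(s)$ exactly; the finite-difference version is only meant to show that no separate computation of $\mathbb{E}(X^2)$ is needed to reach the variance.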


▷ III.5. Mellin transforms of distributions. The quantity $M(s) := \mathbb{E}(X^{s-1})$ is called the
Mellin transform of $X$ (or of its distribution function $F$), when $X$ is supported by $\mathbb{R}_{\ge 0}$. In
particular, if $X$ admits a density, then this notion coincides with the usual definition of a Mellin
transform. When it exists, the value of the Mellin transform at an integer $s = k$ provides the
moment of order $k - 1$. At other points, the Mellin transform provides moments of fractional
order. ◁


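▷ III.6. A "symbolic" fragment of probability theory. Consider discrete random variables
supported by $\mathbb{Z}_{\ge 0}$. Let $X, X_1, \ldots$ be independent random variables with PGF $p(u)$ and let $Y$,
independent of them, have PGF $q(u)$. Then, certain natural operations admit a translation into PGFs:

Operation | Random variable | PGF
Switch | $BX + (1-B)Y$, with $B = \mathrm{Bern}(\lambda)$ | $\lambda\,p(u) + (1-\lambda)\,q(u)$
Random sum | $X_1 + \cdots + X_Y$ | $q(p(u))$
Size bias | probabilities proportional to $k\,\mathbb{P}(X = k)$ | $\dfrac{u\,p'(u)}{p'(1)}$

("Bern" means a Bernoulli $\{0, 1\}$ variable $B$, here with $\mathbb{P}(B = 1) = \lambda$, and the switch is
interpreted as $BX + (1-B)Y$. Size-biased distributions occur in Chapter VII.) ◁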
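Two of these translations are standard facts that can be exercised on small examples: the PGF of a random sum $X_1 + \cdots + X_Y$ (with independent summands of PGF $p$ and an independent number of terms $Y$ of PGF $q$) is the composition $q(p(u))$, and size-biasing turns $p(u)$ into $u\,p'(u)/p'(1)$. The Python sketch below is an illustration only: PGFs of finitely supported variables are represented by lists of exact rational coefficients, and all helper names are ours.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given by their coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Sum of two polynomials given by their coefficient lists."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def random_sum_pgf(p, q):
    """Coefficients of q(p(u)), computed by Horner's rule (q must be non-empty)."""
    result = [q[-1]]
    for coeff in reversed(q[:-1]):
        result = poly_add(poly_mul(result, p), [coeff])
    return result

def size_biased_pgf(p):
    """Coefficients of u p'(u) / p'(1): probabilities k p_k / E(X)."""
    mean = sum(k * pk for k, pk in enumerate(p))
    return [k * pk / mean for k, pk in enumerate(p)]

if __name__ == "__main__":
    half, third = Fraction(1, 2), Fraction(1, 3)
    p = [half, half]              # X: fair {0, 1} coin
    q = [third, third, third]     # Y: uniform on {0, 1, 2}
    print(random_sum_pgf(p, q))   # distribution of X_1 + ... + X_Y
    print(size_biased_pgf(q))     # size-biased version of Y
```

The mean of the random sum read off the resulting coefficients equals $\mathbb{E}(X)\,\mathbb{E}(Y)$, as follows from differentiating $q(p(u))$ at $u = 1$.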


The importance of these transforms derives from the existence of continuity theorems
by which convergence of distributions can be established via convergence of
transforms.


4. Special distributions. A compendium of special distributions is provided by Figure 1.

A Bernoulli trial of parameter $q$ is an event that has probability $q$ of having value 0
(interpreted as "failure") and probability $p$ of having value 1 (interpreted as "success"),
with $p + q = 1$. Formally, this is the set $\Omega = \{0, 1\}$ endowed with the probability measure
$\mathbb{P}(0) = q$, $\mathbb{P}(1) = p$. The binomial distribution (also called Bernoulli distribution)
of parameters $n, q$ is the distribution of the random variable that represents the number of successes
in $n$ independent Bernoulli trials. This is the probability distribution associated with
the game of heads-and-tails. The geometric distribution is the distribution of a random
variable $X$ that records the number of failures till the first success is encountered
in a potentially arbitrarily long sequence of Bernoulli trials. By extension, one also
refers to independent experiments with finitely many possible outcomes as Bernoulli
trials. In that sense, the model of words of some fixed length over a finite alphabet and
nonuniform letter weights (or probabilities) belongs to the category of Bernoulli models;
see Chapter III. The negative binomial distribution of index $m$ (written $NB[m]$)
and parameter $q$ corresponds to the number of failures before $m$ successes are encountered.
We have found in Chapter VII that it is systematically associated with the
number of $r$-components in an unlabelled multiset schema $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$ whose composition
of singularities is of the exp-log type. The geometric distribution appears
in several schemas related to sequences while the logarithmic series distribution is
closely tied to cycles (Chapter V).

Type | Distribution | Prob. (D), density (C) | PGF (D), Char. function (C)
D | Binomial $(n, p)$ | $\binom{n}{k}\,p^{k}(1-p)^{n-k}$ | $(q + pz)^{n}$
D | Geometric $(q)$ | $(1-q)\,q^{k}$ | $\dfrac{1-q}{1-qz}$
D | Neg. binomial$[m]$ $(q)$ | $\binom{m+k-1}{k}\,q^{k}(1-q)^{m}$ | $\left(\dfrac{1-q}{1-qz}\right)^{m}$
D | Log. series $(\lambda)$ | $\dfrac{1}{-\log(1-\lambda)}\,\dfrac{\lambda^{k}}{k}$ | $\dfrac{\log(1-\lambda z)}{\log(1-\lambda)}$
D | Poisson $(\lambda)$ | $e^{-\lambda}\,\dfrac{\lambda^{k}}{k!}$ | $e^{\lambda(z-1)}$
C | Gaussian or Normal, $\mathcal{N}(0,1)$ | $\dfrac{e^{-x^{2}/2}}{\sqrt{2\pi}}$ | $e^{-t^{2}/2}$
C | Exponential | $e^{-x}$ | $\dfrac{1}{1-it}$
C | Uniform $[-\frac12, +\frac12]$ | $[\![-\frac12 \le x \le +\frac12]\!]$ | $\dfrac{\sin(t/2)}{t/2}$

FIGURE III.1. A list of commonly encountered discrete (D) and continuous
(C) probability distributions: type, name, probabilities or density, probability
generating function or characteristic function.

The Poisson distribution counts amongst the most important distributions of prob- 
ability theory. Its essential properties are recalled in Figure 1. It occurs for instance in 
the distribution of singleton cycles and of r-cycles in a random permutation and more 
generally in labelled composition schemes (Chapter IX). 

In this book all probability distributions arising directly from combinatorics are a
priori discrete as they are defined on finite sets, typically a certain subclass $\mathcal{C}_n$ of a
combinatorial class $\mathcal{C}$. However, as the size $n$ of the objects considered grows, these
finite distributions may approach a continuous limit. In this context, by far the most
important law is the Gaussian law also known as normal law, which is defined by its
density and its distribution function:

$$g(x) = \frac{e^{-x^{2}/2}}{\sqrt{2\pi}}, \qquad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^{2}/2}\,dy. \tag{5}$$

The corresponding Laplace transform is then evaluated by completing the square:

$$\lambda(s) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-y^{2}/2 + sy}\,dy = e^{s^{2}/2},$$

and, similarly, the characteristic function is $\phi(t) = e^{-t^{2}/2}$. The distribution of (5) is
referred to as the standard normal distribution, $\mathcal{N}(0, 1)$; if $X$ is $\mathcal{N}(0, 1)$, the variable
$Y = \mu + \sigma X$ defines the normal distribution with mean $\mu$ and standard deviation $\sigma$,
denoted $\mathcal{N}(\mu, \sigma)$.

Characteristic function ($\phi(t)$) | Distribution function ($F(x)$)
$\phi(0) = 1$ | $F(-\infty) = 0$, $F(+\infty) = 1$
$|\phi(t_0)| = 1$ for some $t_0 \ne 0$ | Lattice distribution, span $2\pi/t_0$
$\phi(t) = 1 + i\mu t + o(t)$ as $t \to 0$ | $\mathbb{E}(X) = \mu < \infty$
$\phi(t) = 1 + i\mu t - \nu\,\frac{t^{2}}{2} + o(t^{2})$ as $t \to 0$ | $\mathbb{E}(X^{2}) = \nu < \infty$
$\log\phi(t) = -\frac{t^{2}}{2}$ | $X \overset{d}{=} \mathcal{N}(0,1)$
$\phi(t) \to 0$ as $t \to \infty$ | $X$ is continuous
$\phi(t)$ integrable (is in $L^{1}$) | $X$ is absolutely continuous; density is $w(x) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-itx}\,\phi(t)\,dt$
$\lambda(s) := \phi(-is)$ exists in $\alpha < \Re(s) < \beta$ | Exponential tails
$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{+T} |\phi(t)|^{2}\,dt$ | equals $\sum_j (p_j)^{2}$; the $p_j$ are the jumps
$\phi_n(t) \to \phi(t)$ (pointwise conv.) | $F_n \Rightarrow F$ (weak conv.), $X_n \Rightarrow X$ (conv. in distribution)
$\phi_n$ "close" to $\phi$ | $F_n$ "close" to $F$ (Berry-Esseen)

FIGURE III.2. The correspondence between properties of the distribution function
($F$) of a random variable ($X$) and properties of the corresponding characteristic
function ($\phi$).
Amongst other continuous distributions appearing in this book, we mention the 
theta distributions associated to the height of trees and Dyck paths (Chapter V) and 
the stable laws alluded to in Chapter VI. 


5. Convergence in law. Let $(F_n)$ be a family of distribution functions. We say
generally that the $F_n$ converge weakly to a distribution function $F$ if pointwise

$$\lim_{n} F_n(x) = F(x), \tag{6}$$

for every continuity point $x$ of $F$. This is expressed by writing $F_n \Rightarrow F$ as well
as $X_n \Rightarrow X$, if $X_n$, $X$ are random variables corresponding to $F_n$, $F$. We say that
$X_n$ converges in distribution or converges in law to $X$. For discrete distributions
supported by $\mathbb{Z}$, an equivalent form of (6) is $\lim_n F_n(k) = F(k)$ for each $k \in \mathbb{Z}$;
for continuous distributions, Equation (6) just means that $\lim_n F_n(x) = F(x)$ for
all $x \in \mathbb{R}$. Although in all generality anything can tend to anything else, due to the
finite nature of combinatorics, we shall only need in this book the convergences

Discrete $\Rightarrow$ Discrete, Discrete $\Rightarrow$ Continuous (after standardization).

Properties of random variables are reflected by properties of characteristic
functions and Figure 2 offers an aperçu. Most important for us is the Continuity
Theorem of characteristic functions due to Lévy and stated in Chapter IX. The Berry-Esseen
inequalities also stated in Chapter IX lie at the origin of precise speed of convergence
estimates to asymptotic limits.
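The convergence "Discrete $\Rightarrow$ Continuous (after standardization)" can be observed on the binomial distribution, whose standardized version converges in law to $\mathcal{N}(0,1)$. The Python sketch below is an illustration under our own naming. It measures the largest gap between the distribution function $F_n$ of the standardized binomial and the Gaussian distribution function, examining each jump point from the left and from the right; exact probabilities are computed with `math.comb`, which limits the sketch to moderate values of $n$ (a few hundred) before floating-point overflow occurs.

```python
import math

def normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_cdf_gap(n, p=0.5):
    """Largest |F_n(x) - Phi(x)| over the jump points of the standardized binomial.

    Intended for moderate n only: comb(n, k) is converted to a float.
    """
    q = 1.0 - p
    mu, sigma = n * p, math.sqrt(n * p * q)
    gap, cdf = 0.0, 0.0
    for k in range(n + 1):
        x = (k - mu) / sigma                  # standardized jump location
        phi_x = normal_cdf(x)
        gap = max(gap, abs(cdf - phi_x))      # left limit of F_n at the jump
        cdf += math.comb(n, k) * p ** k * q ** (n - k)
        gap = max(gap, abs(cdf - phi_x))      # value of F_n at the jump
    return gap

if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, max_cdf_gap(n))
```

The gap shrinks roughly by a factor of 2 each time $n$ is multiplied by 4, which is the $O(n^{-1/2})$ speed of convergence that inequalities of Berry-Esseen type quantify.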
R (resultant notation), 685
[z^n] (coefficient extractor), 19
≐ (numeric approximation), 1
E (expectation), 104, 672, 718
ℑ (imaginary part), 217
Ω (asymptotic notation), 668
P (probability), 104, 145
ℜ (real part), 217
Θ (asymptotic notation), 668
V (variance), 672
& (asymptotic notation), 670
⋈ (exponential order), 230
O (asymptotic notation), 668
∘ (substitution), 79
≅ (combinatorial isomorphism), 18
m (analytic mean), 601
v (analytic variance), 601
⟨·⟩ (strip of ℂ), 708
[-| (nearest integer function), 41
[ - | (rounding notation), 246
∮ (contour integral), 514
∂ (derivative), 80
σ (standard deviation), 673
~ (asymptotic notation), 668
⋆ (labelled product), 92
lg (binary logarithm), 286
o (asymptotic notation), 668
Rconv (radius of convergence), 218
Res (residue operator), 221
+, see disjoint union
[[·]] (Iverson’s notation), 54

CYC (cycle construction), 24, 95
MSET (multiset construction), 25
PSET (powerset construction), 25
SEQ (sequence construction), 24, 94
SET (set construction), 94
Θ (pointing), 79


Abel identity, 678
Abel–Plana summation, 226
adjacency matrix (of graph), 321
admissibility (of function), 528-540
admissible construction, 21, 91
Airy area distribution, 349
Airy function, 540, 563, 654, 661
alcohol, 270, 456
algebraic curve, 471
algebraic function, 492, 505
asymptotic, 492
branch, 471
coefficient, 477-492
elimination, 685-687
Newton polygon, 474-476
Puiseux expansion, 474-476
singularities, 471-492
algebraic topology, 189
algorithm
approximate counting, 291-292
balanced tree, 84, 267
binary adder, 285
binary search tree, 192, 410-412
digital tree (trie), 340
hashing, 103, 167, 558
irreducible polynomials, 432
polynomial factorization, 432
shake and paint, 398
TCP protocol, 292
alignment, 110-316
alkanes, 456-458
allocation, see balls-in-bins model, 103-110
alphabet, 47
ambiguity
context-free grammar, 76
regular expression, 293, 679
analytic continuation, 226
analytic function, 218-226
equivalent definitions, 687-689
composition, 393-399
differentiation, 400-404
Hadamard product, 404-409
integration, 400-404
inversion, 236, 261-266, 385-390
iteration, 267-269
Lindelöf integrals, 225
aperiodic, 461
aperiodic (GF), 314
approximate counting, 291-292
area (of Dyck path), 307
argument principle, 256
arithmetical functions, 667
arithmetical semigroups, 83
arrangement, 104, 105
asymptotic
algebraic, 492
expansion, 669 
notations, 668-671 
scale, 669-670 
atom, 23, 90 
autocorrelation (in words), 56, 257 
automaton 
finite, 52 
average, see expectation 


balanced tree, see tree 
ballot numbers, 63 
ballot problem, 72 
balls-in-bins model, 104, 165-167 
capacity, 556-558 
Poisson law, 166 
Bell numbers, 101 
asymptotics, 525-527, 707 
Bell polynomials, 177 
Bernoulli numbers, 254 
Bernoulli trial, 179, 285, 720 
Beta function (B), 693 
BGF, see bivariate generating function 
bijective equivalence (≅), 18
binary decision tree (BDT), 74 
binary search tree, 410-412 
binary search tree (BST), 192 
binary tree, 682 
binomial coefficient, 92 
asymptotics, 364-368 
central approximation, 706-707 
sum of powers, 706-707 
binomial convolution, 92 
binomial distribution, 720 
birth and death process, 296 
birth process, 290 
birthday paradox, 105-110, 180, 181, 397 
bivariate generating function (BGF), 145 
Boltzmann model, 266, 531 
boolean function, 73 
bootstrapping, 286 
bordering condition (permutation), 191 
Borges, Jorge Luis, 58 
boxed product, 129-132 
branch (of curve), 471 
branch point (analytic function), 263 
branch point (function), 218 
branching processes, 185-186 
bridge (lattice path), 73, 482-488 
Brownian motion, 174, 344, 395, 443, 653 
Bürmann inversion, see Lagrange inversion


canonicalization, 80 
cartesian product construction (×), 22
Catalan numbers (C_n), 17, 33-34, 36, 63, 68-74, 683
asymptotics, 367 
generating function, 33 
Catalan sum, 399
Catalan tree, 33, 162
Cauchy’s residue theorem, 222
Cauchy—Riemann equations, 688 
Cayley tree, 117-119, 168 
Cayley tree function, see Tree function (T)
central limit law, 573 
centring (random variable), 719 
Chebyshev inequalities, 150, 674 
Chebyshev polynomial, 304 
circuit (in graph), 321, 329 
circular graph, 91 
class (labelled), 87-138 
class (of combinatorial structures), 16 
cloud, 627 
cluster, 198, 200 
coalescence of saddle point 
with other saddle point, 563 
with roots, 552 
with singularity, 552-553 
code (words), 58 
coding theory, 37, 50, 58, 233 
coefficient extractor ([z^n]), 19
coin fountain, 308, 615 
combination, 48 
combinatorial 
class, 16, 88 
isomorphism (≅), 18
parameter, 139-208 
sums, 396-399 
combinatorial chemistry, 452-458 
combinatorial identities, 693-698 
combinatorial probability, 671-674 
combinatorial schema, see schema 
complete generating function, 174-187 
complex differentiability, 219 
complex dynamics, 267 
complexity theory, 73 
composition (of integer), 37-46 
Carlitz type, 190, 195, 249 
complete GF, 176 
cyclic (wheel), 45 
largest summand, 158, 317, 320 
local constraints, 187-189, 249 
number of summands, 42, 156-157 
prime summands, 41, 317-319 
profile, 158, 316 
r-parts, 157 
restricted summands, 317-319 
composition (singular), 393-399 
computable numbers, 237 
computer algebra, see symbolic manipulation 
concentration (of probability distribution), 150-151
conformal map, 219 
conjugacy principle (paths), 71 
connection problem, 480, 481, 496, 499 
constructible class, 237-242
construction
cartesian product (×), 22
cycle (CYC), 24, 154, 674-676
labelled, 95, 163
disjoint union (+), 24
implicit, 81-84
labelled product (⋆), 92-94
multiset (MSET), 25, 154
pointing (Θ), 79-81, 187
powerset (PSET), 25, 154
labelled, 163
sequence (SEQ), 24, 154
labelled, 94, 163
set (SET)
labelled, 94
substitution (∘), 79-81, 187-190
context-free
asymptotics, 423, 460-462
language, 76-77, 460
specification, 74-77, 460-464
continuant polynomial, 298
continuation (analytic), 226
continued fraction, 184, 205, 270, 295-313
continuous random variable, 717
contour integral (∮), 514
convergence in probability, 150
convexity (of GFs), 266
convexity inequalities, 516
correlation, see autocorrelation
coupon collector problem, 105-110, 180
cover time (walk), 347
covering (of interval), 25
cumulant generating function, 720
cumulated value (of parameter), 147
cumulative generating function, 147
cycle construction (CYC), 24, 154, 674-676
labelled, 95, 163
undirected, labelled, 123
cycle lemma (paths), 71
cyclic permutation, 91


Daffodil Lemma, 253
Darboux’s method, 417
data compression, 261
data mining, 399
de Bruijn graph, 337-339
Dedekind η function, 544
degree (of tree node), 682
density (random variable), 717
denumerant, 41, 244-245
dependency graph, 325
derangement, 113, 196, 248, 352, 430
derivative (∂), 80
devil’s staircase, 336-337
dice games, 549
Dickman function, 591
differential equations, 492-503, 693-698
differential field, 496
differentiation (singular), 400-404
digital tree (trie), 340
digraph, see graph, 321
dilogarithm, 392
directed graph, 321
Dirichlet generating function (DGF), 667
disc of convergence (series), 218
discrete random variable, 717
discriminant (of a polynomial), 471
discriminant (of polynomial), 687
disjoint union construction (+), 24, 91
distribution, see probability distribution
distribution function (random variable), 717
divergent series, 82, 128, 676
dominant singularity, 230
double exponential distribution, 286
Drmota-Lalley-Woods Theorem, 464
drunkard problem, 82, 406-409
Dyck path, 73, 486
area, 307
height, 303-307
dynamical source, 295


EGF, see exponential generating function
Ehrenfest urn model, 109, 313
eigenvalue, see matrix
EIS (Sloane’s Encyclopedia), 17
elimination (algebraic function), 685-687
elliptic function, 307
entire function, 230
entropy, 549
Euler numbers, 134
Euler’s constant (γ), 108, 691
Euler–Maclaurin summation, 226, 711
Eulerian numbers, 198, 612
Eulerian tour (in graph), 338
exceedances (in permutations), 352
excursion (lattice path), 73, 296, 482-488
exp-log transformation, 27, 79
expectation (or mean, average), E, 104, 146, 672, 718
exponential families (of functions), 186
exponential generating function
definition, 89
product, 92
exponential growth formula, 230-236
exponential order (⋈), 230
exponential polynomial, 242, 276-278


Faà di Bruno’s formula, 177
factorial moment, 673
factorial moments, 147
factorial, falling, 494, 696
Ferrers diagram, 38
Fibonacci numbers, 40, 55
Fibonacci polynomial, 304
finite automaton, 52, 323-340
finite field, 83
finite language, 61
finite state model, 334, 342-351
forest (of trees), 62, 119, 681
formal language, see language
formal power series, see power series
formal topology (power series), 676
four-colour theorem, 489
Fourier transform, 718 
fractals, 269 
fragmented permutation, 115 
asymptotics, 234, 527-528 
free group, 194-195 
free tree, see tree, unrooted 
function (of complex variable) 
analytic, 218-226 
differentiable, 219 
entire, 219, 230 
holomorphic, 219 
meromorphic, 220 
functional equation, 261-272 
kernel method, 483 
quadratic method, 490 
functional graph, 119-122, 458-459 
Fundamental Theorem of Algebra, 256, 512 


Galton-Watson process, 185 
gambler ruin sequence, 72 
Gamma function (Γ), 362, 689-693
Gaussian binomial, 43 
Gaussian distribution, 555-556, 721 
Gaussian integral, 690 
general tree, 683 
generating function 
algebraic, 492 
complete, 174-187 
exponential, 87-138 
multivariate, 139-208 
ordinary, 15-86 
geometric distribution, 720 
Gessel’s calculus, 697-698 
GF, see generating function 
golden ratio (φ), 40, 84
graph 
acyclic, 122 
adjacency matrix, 321 
aperiodic, 325 
bipartite, 128 
circuit, 321, 329 
circular, 91 
colouring, 489 
connected, 127-129 
de Bruijn, 337-339 
directed, 321 
enumeration, 96-97 
excess, 123 
functional, 119-122, 458-459 
labelled, 88-89, 96-97, 122-125 
map, 488-492 
non-crossing, 462-464, 478-479 
path, 320-340 
periodic, 1, 325 
random, 124-125
regular, 124, 177, 363, 430, 698 
spanning tree, 323 
strongly connected, 325 
unlabelled, 96-97 
Green’s formula, 688
Groebner basis, 76, 685
group 
free, 194-195 
symmetric, 128-129 


Hadamard product, 280, 404-409, 694 
Hamlet, 51 
Hankel contour, 365, 690 
Hardy–Ramanujan–Rademacher expansion, 545
harmonic function, 688
harmonic number (H_n), 108, 149, 371, 669
asymptotics, 711
generating function, 149
harmonic sum, 708
Hartogs’ Theorem, 713
hashing algorithm, 103, 167, 558
Hayman admissibility, 528-540
Heaviside function, 708
height (of tree), 304-307
Hermite polynomial, 312
hidden pattern, 292-295
hierarchy, 119, 266, 451-452
Hipparchus, 64
histograms, 146
holomorphic function, 219
holonomic functions, 426, 471, 492, 693-698
homotopy (of paths), 221
horse kicks, 584
hypergeometric function
basic, 292
hypergeometric function (₂F₁), 404, 500, 696


implicit construction, 81-84, 127-129, 192-195
Implicit Function Theorem, 698-700
inclusion-exclusion, 195-202, 352
increasing tree, 132-136, 191-192, 500-502
Indoeuropean languages, 451
inheritance (of parameters), 151, 163
integer composition, see composition (of integer)
integer partition, see partition (of integer)
integration (singular), 400-404
interconnection network, 310
inversion (analytic), 236, 385-390
inversion table (permutation), 135
involution, 310
involution (permutation), 113
involution (permutation), 524-525
irregular singularity (ODE), 493
isomorphism (combinatorial, ≅), 18
iteration, 267
iteration (of analytic function), 267-269
iterative specification, 30-32, 237-242
Iverson’s notation ([[·]]), 54


Jacobi trace formula, 323 


kernel method (functional equation), 483 
Knuth–Ramanujan function, see Ramanujan’s Q-function


labelled class, object, 87-138, 163-170 
labelled construction, 92-98 
labelled product (⋆), 92
Lagrange inversion, 62-66, 118, 677-678
Lambert W-function, 118
language, 678 
context-free, 76-77, 460 
formal, 47 
regular, 356, 678-680 
Laplace transform, 696 
Laplace’s method, 559, 700-707 
for sums, 706-707 
Laplacian, 688 
of graph, 323 
large deviations, 549 
large powers, 547-556 
largest components, 320 
Latin rectangle, 698 
lattice path, 72-73, 295-313, 482-488 
decompositions, 297 
lattice points, 46 
Laurent series, 482 
law of large numbers, 147, 673 
law of small numbers, 584 
leaf (of tree), 170, 682 
Lebesgue integral, 716 
Lebesgue measure, 716 
letter (of alphabet), 47 
light bulb, 609 
limit law, 569-663 
Lindelöf integrals, 225
linear fractional transformation, 300 
Liouville’s theorem, 225 
local limit law, 555, 573 
localization (of zeros and poles), 256 
logarithm, binary (lg), 286
logarithmic-series distribution, 316 
logic (first-order), 445 
logics, 445 
longest run (in word), 285-289 
loop (in complex region), 221 
Łukasiewicz codes, 71, 486-487
Lyndon words, 675 


MacMahon’s Master Theorem, 323 
magic duality, 225 
majorant series, 236-237 
map, 488-492, 659-661 
mapping, 119-122, 431, 443-445, 658 
idempotent, 535 
regressive, 135 
mapping pattern, see functional graph 
mark (in combinatorial specification), 156 
marking variable, 19, 152 
Markov chain, 53, 323, 617 
Markov-Chebyshev inequalities, 150, 674 
Master Theorem (of MacMahon), 323 
matrix 
aperiodic, 325 
irreducible, 325
nonnegative, 326
Perron Frobenius theory, 325-326, 329 
positive, 326 
spectrum, 276 
stochastic, 323, 335 
trace, 323 
transfer, 342-351 
tridiagonal, 351 
matrix tree theorem, 323 
Maximum Modulus Principle, 511 
mean, see expectation 
meander (lattice path), 73, 482-488, 594-595 
meander (topology), 500 
measure theory, 715-717 
Meinardus’ method (integer partitions), 545-546
Mellin transform, 288, 306, 707-712 
ménage problem, 351 
meromorphic function, 220 
MGF, see multivariate generating function
mobile (tree), 436 
Möbius function (μ), 667
Möbius inversion, 81, 431, 668
modular form, 544 
moment inequalities, 150-151, 674 
moment method, 295 
moments (of random variable), 146, 672, 718 
monkey saddle, 510, 558-564 
monodromy, 474 
Motzkin numbers, 63, 73, 81 
asymptotics, 379, 478 
Motzkin path, 73, 303, 307, 486 
multinomial coefficient, 92, 175 
multiset construction (MSET), see construction, multiset, 154
multiset construction (MSET), 25
multivariate generating function (MGF), 139-208


naming convention, 19, 90
Narayana numbers, 171
natural boundary, 236
nearest integer function ([-|), 41
necklace, 18, 60
negative binomial distribution, 433, 721
network, 310
neutral object, 23, 90
Newton polygon, 474-476
Newton’s binomial expansion, 33
nicotine, 20
non-crossing configuration, 462-464, 478-479
nonplane tree, 66-68, 117
Nörlund-Rice integrals, 226
normal distribution, see Gaussian distribution
numeric approximation (≐), 1
numerology, 295


O (asymptotic notation), 668 
o (asymptotic notation), 668
ODE (ordinary differential equation), see differential equations
OGF, see ordinary generating function
order constraints (in constructions), 129-136, 190-192
ordinary generating function (OGF), 18
ordinary point (analytic function), 509
orthogonal polynomials, 300, 309
oscillations (of coefficients), 251, 269, 368
outdegree, see degree (of tree node)


pairing (permutation), 113 
parameter (combinatorial), 139-208 
cumulated value, 147 
inherited, 151-154 
recursive, 170-174 
parenthesis system, 73 
parse tree, 76 
partially commutative monoid, 285 
partition 
of sets, see set partition 
partition (of integer), 37-46 
asymptotics, 235 
denumerant, 41, 244-245 
distinct summands, 546 
Durfee square, 43 
Ferrers diagram, 38 
Hardy–Ramanujan–Rademacher expansion, 545
largest summand, 42
Meinardus’ method, 545-546
number of parts, 547
number of summands, 42, 160
plane, 546
prime summands, 546-547
profile, 160
r-parts, 161
partition of set, see set partition
path (in graph), 321
path (in complex region), 221
path length, see tree
patterns
in permutations, 200
in trees, 202
in words, 50-52, 55-58, 200, 257-261, 292-295, 612-613, 617
pentagonal numbers, 46
period (of sequence, GF), 252
periodicity (of coefficients), 250
periodicity (of GF), 314
permutation, 90, 110-114
alternating, 132-134, 255
ascending runs, 197-200, 611-612
bordering condition, 191
cycles, 110-114, 143, 164-165, 430, 600-601
cycles of length m, 583
cyclic, 91
derangement, 113, 196, 248, 352, 430
exceedances, 352
fixed order, 533
increasing subsequences, 538-540 
indecomposable, 82, 129 
inversion table, 135 
involution, 113, 235, 310, 524-525, 538 
local order types, 191 
longest cycle, 113, 533 
longest increasing subsequence, 200, 538-540, 698
ménage, 351 
pairing, 113 
pattern, 200 
profile, 164 
record, 130-131 
records, 600-601 
rises, 197-200 
shortest cycle, 113, 248 
singletons, 579-580 
tree decomposition, 132-134 
Perron Frobenius theory, 325-326, 329 
perturbation theory, 553 
PGF, see probability generating function 
phase transition, 650 
phase transition diagram, 650 
phylogenetic trees, 119 
Picard approximants, 699 
Plana’s summation, 226 
plane partition (of integer), 546 
plane tree, 61-66 
pointing construction (Θ), 79-81, 126-127, 187
Poisson distribution, 721 
Poisson law, 165, 432 
Poisson-Dirichlet process, 591 
Pólya operators, 32
Pólya operators, 239, 429
Pólya–Carlson Theorem, 240
polydisc, 712 
polylogarithm, 225, 390-393, 695 
polynomial 
primitive, 342 
polynomial (finite field), 83, 431-432 
polynomial system, 465, 470 
polyomino, 43, 189, 190, 308, 348-351, 613-615
power series, 15, 18, 89, 141, 152, 176, 676-677 
convergence, 676 
divergent, 82, 128, 676 
formal topology, 676 
product, 676 
quasi-inverse, 676 
sum, 676 
powerset construction (PSET), 25, 154 
labelled, 163 
powerset construction (SET), see construction, powerset
preferential arrangement numbers, 100 
prime number, 215-216 
principal determination (function), 217 
Pringsheim’s theorem, 227
prisoners, 114, 165
probabilistic method, 674
probability (P), 104, 145
probability distribution
Airy area, 349
Bernoulli, 720
binomial, 720
double exponential, 109, 286-289
Gaussian, 555-556, 721
geometric, 720
geometric-birth, 291
logarithmic series, 316
negative binomial, 433, 721
Poisson, 432, 721
Rayleigh, 107, 655
stable laws, 395
theta function, 305, 344
Tracy–Widom, 540
Zipf laws, 657
probability generating function, 673
probability space, 715
profile (of objects), 158, 432-434
pruned binary tree, 683
psi function (ψ), 692
Puiseux expansion (algebraic function), 474-476


q-calculus, 292, 308
q-calculus, 46
quadratic method (functional equation), 490
quadtree, 497-500
quasi-inverse, 32


R (resultant notation), 685
radioactive decay, 584
radius of convergence (series), 218, 231
Ramanujan’s Q-function, 106, 120, 397-399
random generation, 73, 320
random matrix, 539
random variable, 672, 715-723
continuous, 717
density, 717
discrete, 717
random variable (discrete), 145
random walk, see walk
rational function, 224, 242-245, 256-257
positive, 340, 341
Rayleigh distribution, 655
record
in permutation, 130-131
in word, 178
recurrence
tree, 409-414
recursion (semantics of), 31
recursive parameter, 170-174
recursive specification, 30-32
region (of complex plane), 217
regular
expression, 356, 678-680
language, 278-285, 356, 678-680
specification, 278-285
regular point (analytic function), 227
regular singularity (ODE), 493-500
relabelling, 92
renewal process, 320, 609
Res (residue operator), 221
residue, 221-226
Cauchy’s theorem, 222
resultant (R), 76, 685-687
Rice integrals, see Nörlund-Rice integrals
Riemann surface, 227
Rogers-Ramanujan identities, 308
rotation correspondence (tree), 69
Rouché’s theorem, 256
round (children’s), 379
rounding notation ([ - |), 246
RV, see random variable


SA-class (singularity analysis), 383
saddle point
analytic function, 510-512
bounds, 233, 512-516, 548
large powers, 547-556
method, 507-565
multiple, 558-564
scaling (random variable), 719
schema (combinatorial-analytic), 159-160, 167, 169-170
exp-log, 427-434
supercritical sequence, 313-320
Schröder’s problems, 64, 119, 453
self-avoiding configurations, 347-349
semantics of recursion, 31
sequence construction (SEQ), 24, 154
labelled, 94, 163
series
algebraic, 492
series-parallel network, 64, 65, 68
set construction (SET), see construction, set
labelled, 94
set partition, 59-60, 98-110, 167
asymptotics, 235, 525-527
block, 100
largest block, 533
number of blocks, 167, 536-537
several complex variables, 712-713
shuffle product, 283
sieve formula, see inclusion-exclusion
Simon Newcomb’s problem, 181-182
simple variety (of trees), 182, 307, 434
singular expansion (function), 376
singularity
regular (ODE), 493-500
singularity (of function), 226-230
dominant, 230
singularity analysis, 359-420
applications, 421-506
size (of combinatorial object), 16, 88
size-biased (probability), 442
Skolem-Mahler-Lech Theorem, 252
slow variation, 416
Smirnov word, 193, 249, 289, 333 
society (combinatorial class), 535 
spacings, 48 
span (of sequence, GF), 252 
spanning tree, 323 
special functions, 693-698 
species, 29, 86, 127, 137 
specification, 31 
iterative, 30-32 
recursive, 30-32 
spectrum, see matrix 
stable laws, see probability distribution 
standard deviation (σ), 673
standardization (random variable), 572, 719 
statistical physics, 44, 189, 346-347, 422, 500, 650
steepest descent, 509, 513 
Stieltjes integral, 716-717 
Stirling numbers, 680-681 
cycle (1st kind), 111, 144, 600 
partition (2nd kind), 59-60, 100, 168 
Stirling’s approximation, 36, 389, 393, 520-522, 692-693, 705-706, 711
strip (⟨·⟩), 708
subcritical composition schema, 585-590
subexponential factor, 231
subsequence statistics, see hidden patterns, words
substitution construction (∘), 79-81, 187-190
supercritical cycle, 399 
supercritical sequence, 313-320, 394 
supernecklaces, 115 
supertree, 394-396, 479, 661 
support (of probability measure), 715 
support (of sequence, GF), 252 
surjection, 98-110, 316 
asymptotics, 246 
complete GF, 176 
surjection numbers, 101, 254 
symbolic manipulation, 240 
symbolic methods, 15 
symmetric functions, 177, 697-698 


Tauberian theory, 416 
Taylor expansion, 190, 688 
theory of species, 127 
theta function, 305-307, 344 
threshold phenomenon, 199 
tiling, 344-347, 617 
total variation distance (probability), 579 
totient function (φ), 26, 667
trace monoid, see partially commutative monoid 
trains, 240-242, 380 
transfer matrix, 342-351 
transfer theorem, 372-375 
tree, 30, 61-68, 116-125, 681 
additive functional, 438-443 
balanced, 83, 267-269 
binary, 63, 682 


INDEX 


branching processes, 185-186 
Catalan, 33 
Cayley, 117-119 
degree profile, 182-183, 441-442 
exponential bounds, 264-266 
forests, 62 
general, 30, 683 
height, 205, 304-307, 440 
increasing, 132-136, 191, 500-502 
leaf, 170, 619, 682 
level profile, 183-184, 439-440 
Łukasiewicz codes, 71
mobile, 436 
non-crossing, 462-464, 478-479 
nonplane, 66-68, 443, 453 
nonplane, labelled, 117 
parse tree, 76 
path length, 172-174, 184, 442-443 
pattern, 202 
plane, 61-66, 682 
plane, labelled, 117 
quadtree, 497-500 
regular, 63 
root subtrees, 589 
root-degree, 162, 168, 437-438, 587-588 
rooted, 681 
search, 192 
simple variety, 182, 307, 387-390, 434-445, 
552 
supertree, 394-396, 479 
t-ary, 63 
unary-binary, 63, 81 
unrooted, 459-460 
width, 342-344, 659 
tree concepts, 681-683 
Tree function (T), 386-389
tree recurrence, 409-414 
triangulation (of polygon), 19, 33-34, 75 
tridiagonal matrix, 351 
trinomial numbers, 551 
truncated exponential, 102 


unambiguous, see ambiguity 
uniform expansions
singularity analysis, 618-619
uniform probability measure, 672 
uniformity (asymptotic expansions), 671 
uniformization (algebraic function), 473 
universality, 563 
unlabelled structures, 151-163 
urn, 91 
urn model, 109, 313, 503 


Vallée’s identity, 29
valley (saddle point), 510
variance (V), 672
Vitali’s theorem (analytic functions), 580


w.h.p. (with high probability), 125, 150 
walk, 351
birth type, 289-292 
cover time, 347 
devil’s staircase, 336-337 
first return, 82 
in graphs, 320-340 
integer line, 296-301 
interval, 296-307 
lattice path, 72-73, 295-313, 482-488 
self-avoiding, 347-349 
Wallis integral, 693, 703 
Weierstrass Preparation Theorem (WPT), 699-700
wheel, 45 
width (of tree), 342-344 
winding number, 256 
word, 47-61, 103-110 
aperiodic, 675 
code, 58 
excluded patterns, 339 
language, 47, 678 
local constraints, 333 
longest run, 285-289 
pattern, 50-52, 55-58, 200, 257-261, 292-295, 612-613, 617
record, 178 
runs, 48-50, 193 
Smirnov, 193, 249, 289, 333 


Young tableau, 698 


zeta function of graphs, 330 
zeta function, Riemann (ζ), 215, 255, 390, 692, 697